Abstract. Classical probability theory supports probability measures that assign a fixed positive real value to 
each event; these measures are far from satisfactory in formulating real-life occurrences. The main innovation 
of this paper is the introduction of a new probability measure, enabling varying probabilities that are recorded 
by ring elements to be assigned to events; this measure still provides a Bayesian model, resembling the classical 
probability model. 

By introducing two principles for the possible variation of a probability (also known as uncertainty, ambi- 
guity, or imprecise probability), together with the "correct" algebraic structure allowing the framing of these 
principles, we present the foundations for the theory of phantom probability, generalizing classical probability 
theory in a natural way. This generalization preserves many of the well-known properties, as well as familiar 
distribution functions, of classical probability theory: moments, covariance, moment generating functions, the 
law of large numbers, and the central limit theorem are just a few of the instances demonstrating the concept 
of phantom probability theory. 
Introduction 



Over the years much effort has been invested in trying to understand aspects of 
probability in which the evaluations of occurrences, as well as their likelihoods of happening, are uncertain. 
Although the terminology for this type of phenomenon is varied (uncertainty for physicists, ambiguity for 
economists, imprecise probability for mathematicians, and phantom for us), fundamentally, the absence of a 
theory enabling the formulation of such phenomena is a common problem for many fields of study. In this 
paper we introduce a new approach, supported by a novel probability measure, allowing a natural mathematical 
framing of this type of problem. 

Two main principles underlie our approach to treating probability measures associated with varied evalua- 
tions: 

• For each event, the sum of its probability and its possible distortion lies in the real interval [0, 1]; 

• The overall distortions always sum up to 0. 

Having the right algebraic structure, termed here the ring of phantom numbers that naturally records 
probabilities and their oriented variations, these principles lead to the introduction of our new phantom 
probability measure, on which much of the theory of classical probability can be generalized. This gener- 
alization captures both the uncertainty of outcomes and ambiguous likelihoods, and it is still Bayesian. 

The ring PH of phantom numbers consists of elements of the form z = a + pb, each of which is a compound 
of the real term a and the phantom term b (notated, like the complex numbers, by p instead of i), and whose 
operations, addition and multiplication respectively, are 

$$(a_1 + p b_1) \oplus (a_2 + p b_2) = (a_1 + a_2) + p(b_1 + b_2),$$

$$(a_1 + p b_1) \otimes (a_2 + p b_2) = a_1 a_2 + p(a_1 b_2 + b_1 a_2 + b_1 b_2).$$

This arithmetic makes PH suitable for the purpose of carrying a theory of probability. In many ways this ring 
resembles the field of complex numbers, but its arithmetic is different; here p is idempotent, i.e. $p^2 = p$, 
while $i^2 = -1$ for the complexes. Similar structures, though sometimes using different terminology, have been 
studied in the literature, mainly from the abstract point of view of algebra; the innovation of this paper is their 
use in probability theory, which requires some special notions such as the phantom conjugate, reduced elements, 
absolute value, and norm. With these notions suitably defined, the way toward the development of a phantom 
probability theory is prepared. 

One of the main advantages of phantom functions $f : PH \to PH$, mainly polynomial-like functions, is that 
they can be rewritten as 

$$f = f_{re} + p(\hat{f} - f_{re}),$$

where $f_{re}$ and $\hat{f}$ are real functions $R \to R$. We call this property, which plays a main role in our exposition, 
the realization property of phantom functions. 

With this realization property satisfied, most of the phantom calculations are reduced simply to the familiar 
real calculations. Moreover, for z = a + pb, the real term a of z is the only argument involved in $f_{re}$; this 
shows that when G is a phantomization of a real function $g : R \to R$, the real component $G_{re}$ of G is just g. 
Surprisingly, the phantomizations of all classical probability functions (moments, variances, covariances, etc.) 
admit the realization property. 

Using the phantom ring structure, together with our measure principles, we keep track of the evolution of 
the classical theory of probability. The leading motif throughout our exposition is that restricting the theory 
to the real terms of all the arguments involved always leaves one with the well-known classical theory. Given 
this foundation, as well as the appropriate definitions, the probability insights are much clearer and their 
proofs become more transparent. 

The main topics covered by this paper include: 

• Conditional probability, independence, and Bayes' rule; 

• Random variables (discrete, continuous, and multiple); 

• Attributes of random variables: moments, variances, covariances, moment generating functions; 

• Inequalities (appropriately defined); 

• Limit theorems. 

Along our exposition we also provide many examples demonstrating how classical results naturally carry over 
to the phantom framework; further results and applications will appear in our future papers. 

The fact that the phantom probability space provides a Bayesian probability model paves the way for 
developing a theory of phantom stochastic processes and phantom Markov chains [9] with a view towards 
applications in dynamical systems. 

We use the notion of imprecise probability as a generic term to cover all mathematical models which 
measure chance or uncertainty without sharp numerical probabilities [16] . The known results of past efforts to 
find a theory that frames imprecise probability give only partial or complicated answers. For example, fuzzy 
probability [17] only treats uncertain outcomes but not varying probabilities; conversely, complex probability 
provides a partial answer for deformed probabilities but only for fixed outcomes [TJ |T5] . On the other hand, 
the operator measure theory [T3] is very complicated and not intuitive, while the min-max model [7] is not 
Bayesian and [13] sometimes becomes non-additive. 

These probability theories have a tremendous range of applications, such as quantum mechanics, statistics, 
stochastic processes, dynamical systems, game theory, economics, mathematical finance, and decision-making 
theory, to name just a few. Our development, together with the attendant examples, which smoothly extend 
the known theories that have already proven to be significant, leads one to believe that phantom theory could 
contribute to these applications and make for a better understanding of phenomena that arise in real 
life. 

1. The phantom ground ring 

1.1. Ground ring structure. The central idea of our new approach is a generalization of the field $(R, +, \cdot)$ of 
real numbers to a ring structure whose binary operations are induced by the familiar addition and multiplication 
of R. Focusing on applications to probability theory, and to make our exposition clearer, we give the explicit 
description of a certain extension of R of order 1, which suffices for the scope of this paper. For 
the sake of completeness, and in an effort to attract audiences from various fields of study, we recall some of 
the standard algebraic definitions (see [15]) and present the full proofs related to the basics of the algebraic 
structure for the extension of order 1. The more general phantom framework is outlined in the next subsection. 

Set theoretically, our ground ring $PH_{(1)}(R)$, called a ring with phantoms or phantom ring, for short, is 
the Cartesian product $R \times R$; for simplicity, we write a + pb for a pair $(a, b) \in PH_{(1)}(R)$. We say that $PH_{(1)}(R)$ 
is a phantom ring of order 1 over the reals, and denote it as PH, for short. (The general case of order > 1 
is spelled out later in Subsection 1.2.) The elements of PH are called phantom numbers, usually denoted 
x, y, z. 

In what follows we use the generic notation that $a, b \in R$ for reals and write z := a + pb for a phantom 
number z; we call a the real term of z while b is termed the phantom term of z. We use the notation 

re(z) := a and ph(z) := b 

for the real term and the phantom term of z = a + pb, respectively. (The reason for calling the second argument 
"phantom" arises from the meaning assigned to this value in the extension of the probability measure, as 
explained in Section 2.) 

The set PH is then equipped with the two binary operations, addition and multiplication, respectively, 

$$(a_1 + p b_1) \oplus (a_2 + p b_2) := (a_1 + a_2) + p(b_1 + b_2),$$

$$(a_1 + p b_1) \otimes (a_2 + p b_2) := a_1 a_2 + p(a_1 b_2 + b_1 a_2 + b_1 b_2),$$

to establish the phantom ring (to be proved next), $(PH, \oplus, \otimes)$, with unit 1 := 1 + p0 and zero 0 := 0 + p0. 
We write $PH^{\times}$ for PH \ {0}, p for p1, and a - pb for a + p(-b). 
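
As an illustrative aside, the two operations can be sketched in a few lines of Python. The class name `Phantom` and the constants `ONE`, `ZERO`, and `P` are hypothetical helpers introduced only for this sketch; they are not part of the paper's notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phantom:
    """A phantom number a + p*b (hypothetical helper, for illustration only)."""
    a: int  # real term re(z)
    b: int  # phantom term ph(z)

    def __add__(self, other):
        # Addition is coordinate-wise.
        return Phantom(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a1 + p b1)(a2 + p b2) = a1 a2 + p (a1 b2 + b1 a2 + b1 b2)
        return Phantom(
            self.a * other.a,
            self.a * other.b + self.b * other.a + self.b * other.b,
        )


ONE = Phantom(1, 0)   # unit 1 + p0
ZERO = Phantom(0, 0)  # zero 0 + p0
P = Phantom(0, 1)     # the element p = p1

z1 = Phantom(2, 3)
z2 = Phantom(1, -1)
print(z1 + z2)     # sum of the two phantom numbers
print(z1 * z2)     # product of the two phantom numbers
print(P * P == P)  # p is idempotent
```

Integer terms keep the arithmetic exact; any exact numeric type (for instance `fractions.Fraction`) could be substituted for the two fields.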

Remark 1.1. In general, similar algebraic structures (with different terminologies) are known in the literature, 
mainly for graded algebras or k-algebras in semiring theory, usually applied to a tensor $M \otimes_k k$, where M is 
a module over a semiring k, cf. [10]. However, as will be seen immediately, in this paper we push the algebraic 
theory much further for the special case where M = k is a field; in this case, $k \otimes k$ has a much richer 
structure. Moreover, some of our definitions are unique, with the aim of serving applications in probability and 
measure theory. 

Although the multiplication of $(PH, \oplus, \otimes)$ is somewhat reminiscent of the multiplication of the complex 
numbers C, it is different: for the phantoms, p is a multiplicative idempotent, $p = p^2$, while for the complexes 
$i^2 = -1$, so i is not idempotent. 

Proposition 1.2. $(PH^{\times}, \otimes)$ is an Abelian semigroup. 
Proof. Given $z_i = a_i + p b_i$, where i = 1, 2, 3, we have 

$$\begin{aligned}
(z_1 \otimes z_2) \otimes z_3
&= \big(a_1 a_2 + p(a_1 b_2 + b_1 a_2 + b_1 b_2)\big) \otimes (a_3 + p b_3) \\
&= (a_1 a_2) a_3 + p\big((a_1 a_2) b_3 + (a_1 b_2 + b_1 a_2 + b_1 b_2) a_3 + (a_1 b_2 + b_1 a_2 + b_1 b_2) b_3\big) \\
&= a_1 (a_2 a_3) + p\big(a_1 (a_2 b_3) + a_1 (b_2 a_3) + b_1 (a_2 a_3) + b_1 (b_2 a_3) + a_1 (b_2 b_3) + b_1 (a_2 b_3) + b_1 (b_2 b_3)\big) \\
&= a_1 (a_2 a_3) + p\big(a_1 (a_2 b_3 + b_2 a_3 + b_2 b_3) + b_1 (a_2 a_3) + b_1 (a_2 b_3 + b_2 a_3 + b_2 b_3)\big) \\
&= (a_1 + p b_1) \otimes \big((a_2 a_3) + p(a_2 b_3 + b_2 a_3 + b_2 b_3)\big) \\
&= z_1 \otimes (z_2 \otimes z_3),
\end{aligned}$$

which proves associativity. Commutativity is obtained by 

$$z_1 \otimes z_2 = a_1 a_2 + p(a_1 b_2 + b_1 a_2 + b_1 b_2) = a_2 a_1 + p(a_2 b_1 + b_2 a_1 + b_2 b_1) = z_2 \otimes z_1.$$

This shows that $(PH^{\times}, \otimes)$ is a (multiplicative) Abelian semigroup. □ 

Theorem 1.3. $(PH, \oplus, \otimes)$ is a commutative ring. 

Proof. Since $\oplus$ is defined coordinate-wise, and (R, +) is an (additive) commutative group, it is clear that 
$(PH, \oplus)$ is also a commutative group. The unique additive inverse -z of z = a + pb is 

$$-z := (-a) + p(-b).$$

The pair $(PH^{\times}, \otimes)$ is a (multiplicative) Abelian semigroup, by Proposition 1.2, so we need to prove the 
distributivity of $\otimes$ over $\oplus$: 

$$\begin{aligned}
z_1 \otimes (z_2 \oplus z_3)
&= (a_1 + p b_1) \otimes (a_2 + p b_2 + a_3 + p b_3) \\
&= (a_1 + p b_1) \otimes \big((a_2 + a_3) + p(b_2 + b_3)\big) \\
&= a_1 (a_2 + a_3) + p\big(a_1 (b_2 + b_3) + b_1 (a_2 + a_3) + b_1 (b_2 + b_3)\big) \\
&= a_1 a_2 + a_1 a_3 + p(a_1 b_2 + b_1 a_2 + b_1 b_2) + p(a_1 b_3 + b_1 a_3 + b_1 b_3) \\
&= (z_1 \otimes z_2) \oplus (z_1 \otimes z_3).
\end{aligned}$$

All together we have proved that $(PH, \oplus, \otimes)$ has the structure of a commutative ring. □ 
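
The ring axioms proved above can also be spot-checked numerically. The following is only a sanity check on random rational inputs, not a substitute for the proofs; the function names `add`, `mul`, `random_phantom`, and `check_ring_axioms` are hypothetical, and phantom numbers are represented as pairs (a, b) standing for a + pb.

```python
import random
from fractions import Fraction


def add(x, y):
    """Coordinate-wise phantom addition on pairs (a, b) ~ a + p*b."""
    return (x[0] + y[0], x[1] + y[1])


def mul(x, y):
    """Phantom multiplication: a1 a2 + p (a1 b2 + b1 a2 + b1 b2)."""
    (a1, b1), (a2, b2) = x, y
    return (a1 * a2, a1 * b2 + b1 * a2 + b1 * b2)


def random_phantom(rng):
    """Draw a phantom number with small rational terms (exact arithmetic)."""
    return (
        Fraction(rng.randint(-9, 9), rng.randint(1, 9)),
        Fraction(rng.randint(-9, 9), rng.randint(1, 9)),
    )


def check_ring_axioms(trials=200, seed=0):
    """Check associativity, commutativity and distributivity on random triples."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z = (random_phantom(rng) for _ in range(3))
        assert mul(mul(x, y), z) == mul(x, mul(y, z))             # associativity
        assert mul(x, y) == mul(y, x)                             # commutativity
        assert mul(x, add(y, z)) == add(mul(x, y), mul(x, z))     # distributivity
    return True


print(check_ring_axioms())
```

Exact fractions are used so that equality tests are not affected by floating-point rounding.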

Note that $(PH^{\times}, \otimes)$ is not a group, and thus $(PH, \oplus, \otimes)$ is not a field, since there are nonzero numbers 
$z \in PH^{\times}$ without an inverse; for example z = 0 + pb. 

Recalling that a nonzero ring element $z_1$ is a zero divisor if there exists a nonzero element $z_2$ such that 
$z_1 \otimes z_2 = 0$, one observes that the phantom ring $(PH, \oplus, \otimes)$ is not an integral domain, i.e. it has zero divisors; 
for example 

$$(0 + pb) \otimes (-b + pb) = 0 + p\big(0 \cdot b + b(-b) + b \cdot b\big) = 0,$$

and thus 0 + pb and -b + pb are zero divisors. 

Proposition 1.4. All the zero divisors of $(PH, \oplus, \otimes)$ are of the form 

$$z = 0 + pa \quad \text{or} \quad z = (-a) + pa, \tag{1.1}$$

for some $a \in R$. 

Proof. Assume $z_1 = a_1 + p b_1$ and $z_2 = a_2 + p b_2$ are nonzero elements such that $z_1 \otimes z_2 = 0$, that is 

$$(a_1 + p b_1) \otimes (a_2 + p b_2) = a_1 a_2 + p(a_1 b_2 + a_2 b_1 + b_1 b_2) = 0.$$

Suppose $a_1 \neq 0$; then, by the real term of the product, $a_2 = 0$. So, by the phantom term, we should have 
$a_1 b_2 + b_1 b_2 = (a_1 + b_1) b_2 = 0$. But $b_2 \neq 0$, since $z_2 \neq 0$, and thus $b_1 = -a_1$. This means that $z_1 = a_1 - p a_1$ 
and $z_2 = 0 + p b_2$, as required. □ 
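
To make the two shapes of zero divisors concrete, here is a small sketch that tests whether a pair (a, b), read as a + pb, has one of the forms in (1.1) and, if so, produces a nonzero partner annihilating it. The helper names `is_zero_divisor` and `annihilator_example` are hypothetical and introduced only for illustration.

```python
def mul(x, y):
    """Phantom product of pairs (a, b) standing for a + p*b."""
    (a1, b1), (a2, b2) = x, y
    return (a1 * a2, a1 * b2 + b1 * a2 + b1 * b2)


def is_zero_divisor(z):
    """Nonzero z = a + p*b with a == 0 or a + b == 0, i.e. one of the forms in (1.1)."""
    a, b = z
    return z != (0, 0) and (a == 0 or a + b == 0)


def annihilator_example(z):
    """Return one nonzero w with z * w == 0, for a zero divisor z."""
    a, b = z
    if not is_zero_divisor(z):
        raise ValueError("z is not a zero divisor")
    # z = 0 + p*b is killed by (-b) + p*b; z = a - p*a is killed by 0 + p*1.
    return (-b, b) if a == 0 else (0, 1)


for z in [(0, 5), (3, -3)]:
    w = annihilator_example(z)
    print(z, w, mul(z, w))
```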
A nonzero element $z \in PH^{\times}$ which is not of the form (1.1) is called a nonzero divisor; the collection of 
all zero divisors in $(PH, \oplus, \otimes)$ is denoted 

$$Z_{div}(PH) = \{\, z \in PH \mid z \text{ is a zero divisor} \,\}.$$

We sometimes write $Z^0_{div}$ for the union $Z_{div} \cup \{0\}$. 

Definition 1.5. The phantom conjugate $\bar{z}$ of z = a + pb is defined to be 

$$\bar{z} := (a + b) - pb.$$

The real number 

$$\hat{z} := a + b$$

is called the (real) reduction of z. 

Having the notion of (real) reduction, we can write the product of two phantom numbers as: 

$$z_1 \otimes z_2 = a_1 a_2 + p(\hat{z}_1 \hat{z}_2 - a_1 a_2). \tag{1.2}$$



Remark 1.6. By Proposition 1.4, one sees that z = a + pb in PH is a zero divisor iff a = 0 or the reduction $\hat{z} = a + b = 0$; when 
both of them are zero then z = 0. Moreover, in this view, given a suitable topology on PH, the complement of 
$Z^0_{div}$ in PH is dense, so we can omit the zero divisors without detracting from the abstract theory. 

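One can easily verify the following properties for phantom conjugates $\bar{z}$ and (real) reductions $\hat{z}$: 

Properties 1.7. For any z = a + pb the following properties are satisfied: 

(1) $z = a + p(\hat{z} - a) = (\hat{z} - b) + pb$, 

(2) $\overline{z_1 \oplus z_2} = \bar{z}_1 \oplus \bar{z}_2$, 

(3) $\overline{(-z)} = -(\bar{z})$, 

(4) $\overline{z_1 \otimes z_2} = \bar{z}_1 \otimes \bar{z}_2$, 

(5) $ph(z \otimes \bar{z}) = ph(\bar{z} \otimes z) = 0$, 

(6) $\bar{z} = \hat{z} - pb$, 

(7) $\widehat{z_1 \oplus z_2} = \hat{z}_1 + \hat{z}_2$, 

(8) $\widehat{(-z)} = -(\hat{z})$, 

(9) $\widehat{z_1 \otimes z_2} = \hat{z}_1 \cdot \hat{z}_2$. 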
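
The listed properties lend themselves to a quick mechanical check. The sketch below represents a + pb as a pair (a, b), takes the conjugate to be (a + b) - pb and the reduction to be a + b as in Definition 1.5, and verifies a few of the identities on sample values; the function names `conj` and `red` are hypothetical.

```python
def add(x, y):
    """Coordinate-wise phantom addition on pairs (a, b) ~ a + p*b."""
    return (x[0] + y[0], x[1] + y[1])


def mul(x, y):
    """Phantom multiplication: a1 a2 + p (a1 b2 + b1 a2 + b1 b2)."""
    (a1, b1), (a2, b2) = x, y
    return (a1 * a2, a1 * b2 + b1 * a2 + b1 * b2)


def conj(z):
    """Phantom conjugate: (a + b) - p*b."""
    a, b = z
    return (a + b, -b)


def red(z):
    """Real reduction: a + b."""
    a, b = z
    return a + b


x, y = (2, 3), (1, -1)
print(conj(mul(x, y)) == mul(conj(x), conj(y)))  # conjugation respects products
print(red(mul(x, y)) == red(x) * red(y))         # reduction respects products
print(mul(x, conj(x)))                           # phantom term of z * conj(z) vanishes
```

In particular, the last two checks illustrate that the reduction behaves like a ring homomorphism onto the reals and that the product of z with its conjugate is a real number.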

Remark 1.8. The elements of PH can be understood as intervals in R. This means that an element 
z = a + pb stands for the interval that starts at a and ends at a + b, i.e. the reduction $\hat{z}$ of z. Thus, PH can be 
realized as a ring of intervals, given by: 

$$[a_1, a_1 + b_1] \oplus [a_2, a_2 + b_2] = [a_1 + a_2,\ a_1 + a_2 + b_1 + b_2];$$

$$[a_1, a_1 + b_1] \otimes [a_2, a_2 + b_2] = [a_1 a_2,\ a_1 a_2 + a_1 b_2 + a_2 b_1 + b_1 b_2].$$

In order to get a canonical interval representation, z = a + pb is assigned to the half-open interval $[a, \hat{z})$. 

In this view, zero divisors are intervals with 0 as one of their endpoints. This view also provides the 
motivation for the definition of the conjugate: $z$ and $\bar{z}$ represent the same interval but with switched endpoints. 

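In fact, $(PH, \oplus, \otimes)$ has a much richer structure than a standard ring; the division is well defined for all 
nonzero divisors in $PH^{\times}$, and each has an inverse. Given a nonzero divisor $z \in PH^{\times}$, we define the multiplicative 
inverse of z to be 

$$z^{-1} := \frac{1}{a} + p\,\frac{(-b)}{a(a+b)} = \frac{1}{a} + p\,\frac{(-b)}{a\hat{z}} = \frac{1}{a} + p\left(\frac{1}{\hat{z}} - \frac{1}{a}\right); \tag{1.4}$$

indeed $z^{-1}$ is an inverse of z, 

$$z \otimes z^{-1} = (a + pb) \otimes \left(\frac{1}{a} + p\,\frac{(-b)}{a(a+b)}\right) = 1 + p\,\frac{-ab + b(a+b) - b^2}{a(a+b)} = 1 + p0.$$

One can easily verify that $z^{-1}$ is unique, and that the reduction of an inverse number has the form: 

$$\widehat{z^{-1}} = \frac{1}{a+b} = \frac{1}{\hat{z}}.$$

Since the multiplicative inverse, defined only for all $z \in PH \setminus Z^0_{div}(PH)$, and the additive inverse are unique, 
we define the division and the subtraction, respectively, for $(PH, \oplus, \otimes)$ as 

$$z_1 \oslash z_2 := z_1 \otimes z_2^{-1} \quad \text{and} \quad z_1 \ominus z_2 := z_1 \oplus (-z_2),$$

where $\oslash$ is defined only for a nonzero divisor $z_2 \notin Z^0_{div}$. Accordingly, we write 

$$z_1 \oslash z_2 = \frac{a_1}{a_2} + p\,\frac{b_1 a_2 - a_1 b_2}{a_2 (a_2 + b_2)}, \tag{1.5}$$

which leads to the following useful form: 

$$z_1 \oslash z_2 = \frac{a_1}{a_2} + p\,\frac{\hat{z}_1 a_2 - a_1 \hat{z}_2}{a_2 \hat{z}_2} = \frac{a_1}{a_2} + p\left(\frac{\hat{z}_1}{\hat{z}_2} - \frac{a_1}{a_2}\right). \tag{1.6}$$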
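
The inverse and the division can be sketched with exact rational arithmetic. The sketch assumes the realization-style reading of the formulas above, namely that the real term of the inverse is 1/a and its reduction is 1/(a + b), and likewise that a quotient has real term a1/a2 and reduction (a1 + b1)/(a2 + b2). The function names `inverse` and `divide` are hypothetical.

```python
from fractions import Fraction


def mul(x, y):
    """Phantom multiplication on pairs (a, b) ~ a + p*b."""
    (a1, b1), (a2, b2) = x, y
    return (a1 * a2, a1 * b2 + b1 * a2 + b1 * b2)


def inverse(z):
    """Multiplicative inverse: 1/a + p (1/(a + b) - 1/a), for nonzero divisors only."""
    a, b = z
    if a == 0 or a + b == 0:
        raise ZeroDivisionError("zero and zero divisors are not invertible")
    one = Fraction(1)
    return (one / a, one / (a + b) - one / a)


def divide(z1, z2):
    """Quotient: a1/a2 + p ((a1 + b1)/(a2 + b2) - a1/a2)."""
    (a1, b1), (a2, b2) = z1, z2
    if a2 == 0 or a2 + b2 == 0:
        raise ZeroDivisionError("division by zero or by a zero divisor")
    real = Fraction(a1) / a2
    return (real, Fraction(a1 + b1) / (a2 + b2) - real)


z = (2, 3)
print(inverse(z))           # real term 1/2, phantom term 1/5 - 1/2
print(mul(z, inverse(z)))   # the unit 1 + p0
print(divide((1, 1), z))    # agrees with (1, 1) * inverse(z)
```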

Definition 1.9. A phantom number z = a + pb is said to be positive if a > 0 and b > 0. When a > 0 and 
$\hat{z} > 0$ we say that z is pseudo positive. If a < 0 and b < 0, then z is said to be negative, and when a < 0 and 
$\hat{z} < 0$ we say that z is pseudo negative. When z is pseudo positive or 0 it is termed pseudo nonnegative, 
and if z is pseudo negative or 0 it is called pseudo nonpositive. 

Clearly, any positive (negative) phantom number is also pseudo positive (negative). In particular, if z is 
pseudo positive, or pseudo negative, then $z \notin Z^0_{div}$, cf. Remark 1.6, and z is multiplicatively invertible. 

Lemma 1.10. Given two pseudo nonnegatives $z_1, z_2 \in PH$ then: 

(i) Their sum is pseudo nonnegative, 

(ii) Their product is pseudo nonnegative, 

(iii) $z^2$ is pseudo nonnegative for each $z \in PH$, 

(iv) When $z_2$ is pseudo positive, the fraction $z_1 \oslash z_2$ is pseudo nonnegative. 

Proof. 

(i) Write $z_1 + z_2 = (a_1 + a_2) + p(b_1 + b_2)$; since $a_1 + a_2$ is nonnegative, and $b_1 \geq -a_1$ and $b_2 \geq -a_2$, the proof 
is clear. 

(ii) By Equation (1.2), $z_1 z_2 = a_1 a_2 + p(\hat{z}_1 \hat{z}_2 - a_1 a_2)$. Then, by the hypothesis, $a_1 a_2 \geq 0$ and $\widehat{z_1 z_2} = 
a_1 a_2 + \hat{z}_1 \hat{z}_2 - a_1 a_2 = \hat{z}_1 \hat{z}_2 \geq 0$. 

(iii) Use (ii) with $z = z_1 = z_2$, or write directly $z^2 = a^2 + p(\hat{z}^2 - a^2)$, so $a^2 \geq 0$ and then $\widehat{(z^2)} = 
a^2 + (a + b)^2 - a^2 = (a + b)^2 \geq 0$. 

(iv) Writing $z_1 \oslash z_2$ as in Equation (1.6), $a_2$ and $\hat{z}_2$ are (real) positives, and thus 

$$\frac{\hat{z}_1}{\hat{z}_2} - \frac{a_1}{a_2} = \frac{(a_1 + b_1) a_2 - a_1 (a_2 + b_2)}{a_2 \hat{z}_2} = \frac{b_1 a_2 - a_1 b_2}{a_2 \hat{z}_2} \geq \frac{-a_1 a_2 - a_1 b_2}{a_2 \hat{z}_2} = -\frac{a_1}{a_2}.$$

□ 

Next, we outline the view of our structure in the category of rings. Categorically, we have the trivial 
embedding 

$$\varphi : (R, +, \cdot) \to (PH, \oplus, \otimes),$$

given by sending $\varphi : a \mapsto a + p0$. On the other hand, we also have the onto projection 

$$\pi : (PH, \oplus, \otimes) \to (R, +, \cdot),$$

given by sending $\pi : a + pb \mapsto \alpha a + \beta b$ for some real numbers $\alpha$ and $\beta$. If $\beta = 0$, the projection is phantom 
forgetful, i.e. $\pi : z \mapsto \alpha(re(z))$, while $\pi$ is real forgetful when $\alpha = 0$. 

Remark 1.11. Viewing $(PH, \oplus, \otimes)$ as an R-module, we define the scalar multiplication $R \times PH \to PH$ as 

$$r(a + pb) := \varphi(r) \otimes (a + pb),$$

for any $r \in R$, which is written as r(a + pb) = (ra) + p(rb), for simplicity. Similarly, we write $\frac{a}{r} + p\frac{b}{r}$ for 
$\varphi(\frac{1}{r}) \otimes (a + pb)$. 

It is easy to check that the set of real numbers forms a subfield in the ring of phantom numbers $(PH, \oplus, \otimes)$, 
and the phantom numbers whose real term is zero form an ideal in $(PH, \oplus, \otimes)$. 

1.2. Generalization. In the previous subsection we described the extension of order 1 of the field of real 
numbers. For completeness, we present the general definition of a phantom ring of arbitrary order. 

Given a field K of characteristic $\neq 2$, usually the field R of real numbers, the phantom ring $PH_{(n)}(K)$ of 
order n, or n-phantom ring, for short, is built over the product $K \times \cdots \times K$ of n + 1 copies of K indexed 
0, 1, ..., n. Accordingly the elements of $PH_{(n)}(K)$ are just (n + 1)-tuples $(x_0, x_1, \ldots, x_n)$ and $(y_0, y_1, \ldots, y_n)$, 
denoted, respectively, as x and y. $PH_{(n)}(K)$ is then equipped with the following binary operations, addition 
and multiplication, respectively: 

$$\begin{aligned}
x \oplus y &:= (x_0 + y_0,\ x_1 + y_1,\ \ldots,\ x_n + y_n), \\
x \otimes y &:= (x_0 y_0,\ \ldots,\ x_i \hat{y}_{i-1} + y_i \hat{x}_{i-1} + x_i y_i,\ \ldots,\ x_n \hat{y}_{n-1} + y_n \hat{x}_{n-1} + x_n y_n),
\end{aligned} \tag{1.7}$$

where $\hat{x}_i = \sum_{j=0}^{i} x_j$ and $\hat{y}_i = \sum_{j=0}^{i} y_j$. 

(Note that the notation here is different from that used in the previous subsection; in particular the $x_i$ and 
the $y_i$, $i \geq 1$, stand for the phantom terms of the respective level.) 

Numbering the copies of K sequentially, the first copy $K_0$ is considered as the real part of $PH_{(n)}(K)$ while 
$K_i$, $i \geq 1$, is said to be the phantom of level i of $PH_{(n)}(K)$. Note that $PH_{(0)}(K)$ is just K, which is a subfield 
of $PH_{(n)}(K)$. 

Having the operations rigorously defined for any $n \in N$, using the arithmetic defined in (1.7), we can push 
n to infinity and also define the $\infty$-phantom ring $PH_{(\infty)}(K)$. 

In the sequel, for simplicity, we apply our development only to $PH_{(1)}(R)$, which as we have said is denoted 
PH, though we note that it extends smoothly to any $PH_{(n)}(K)$, with n > 1, defined over a suitable field K. 
Generalizing the future definitions suitably to n, the n-phantom ring $PH_{(n)}(R)$ also carries the same properties 
as $PH_{(1)}(K)$, to be described in the next sections. 
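
As a hedged illustration of the order-n arithmetic, the sketch below multiplies two (n + 1)-tuples under the assumption that level i of the product is $x_i \hat{y}_{i-1} + y_i \hat{x}_{i-1} + x_i y_i$, with the hats denoting partial sums over levels 0, ..., i - 1. The function names `phantom_add_n` and `phantom_mul_n` are hypothetical. Under this reading the partial sums of a product equal the products of the partial sums, which the usage lines display.

```python
from itertools import accumulate


def phantom_add_n(x, y):
    """Coordinate-wise addition of two (n + 1)-tuples."""
    return tuple(xi + yi for xi, yi in zip(x, y))


def phantom_mul_n(x, y):
    """Order-n phantom product, level by level, using partial sums of lower levels."""
    if len(x) != len(y):
        raise ValueError("both arguments must have the same order")
    x_hat = list(accumulate(x))  # x_hat[i] = x_0 + ... + x_i
    y_hat = list(accumulate(y))  # y_hat[i] = y_0 + ... + y_i
    out = [x[0] * y[0]]          # level 0 is the ordinary real product
    for i in range(1, len(x)):
        out.append(x[i] * y_hat[i - 1] + y[i] * x_hat[i - 1] + x[i] * y[i])
    return tuple(out)


# Order 1 recovers the product of PH: (2 + 3p)(1 - p).
print(phantom_mul_n((2, 3), (1, -1)))

# Order 2: partial sums of the product versus products of partial sums.
x, y = (1, 2, 3), (4, 5, 6)
prod = phantom_mul_n(x, y)
print(list(accumulate(prod)))
print([a * b for a, b in zip(accumulate(x), accumulate(y))])
```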

Notations: For the rest of this paper, assuming that the reader is familiar with the arithmetical nuances, we 
write $z_1 + z_2$ for $z_1 \oplus z_2$, $z_1 - z_2$ for $z_1 \ominus z_2$, $z_1 z_2$ for the product $z_1 \otimes z_2$, $\frac{z_1}{z_2}$ for the division $z_1 \oslash z_2$, and $z^n$ 
for $z \otimes \cdots \otimes z$ repeated n times. The phantom ring $(PH, \oplus, \otimes)$ is denoted PH, for short. 

1.3. Relations and orders. In the sequel, mainly for the development of phantom probability theory, we 
need some relations that help to utilize the structure of PH. 

To make our paper reasonably self-contained, let us recall the property of a binary relation on a set for 
being an order: 

Definition 1.12. A binary relation $\preceq_{wk}$ is a weak order on a set S if the following properties hold: 

(i) Reflexivity: $s \preceq_{wk} s$ for all $s \in S$; 

(ii) Transitivity: $s_1 \preceq_{wk} s_2$ and $s_2 \preceq_{wk} s_3$ implies $s_1 \preceq_{wk} s_3$; 

(iii) Comparability (trichotomy law): for any $s_1, s_2 \in S$, either $s_1 \preceq_{wk} s_2$ or $s_2 \preceq_{wk} s_1$. 

(When $s_1 \preceq_{wk} s_2$ and $s_2 \preceq_{wk} s_1$ we write $s_1 \sim_{wk} s_2$.) A weakly ordered set is a pair $(S, \preceq_{wk})$ where 
S is a set and $\preceq_{wk}$ is a weak order on S. When S consists of phantom numbers we say that $(S, \preceq_{wk})$ is a 
phantom weakly ordered set. 

Adding the extra axiom: 

(iv) Antisymmetry: $s_1 \preceq_{wk} s_2$ and $s_2 \preceq_{wk} s_1$ implies $s_1 = s_2$; 

the order $\preceq_{wk}$ is then a total order, or order, for short, and is denoted as $\leq$. 

Clearly, $\sim_{wk}$ induces an equivalence relation on PH, the classes of which are $PH/\!\sim_{wk}$, and when $\preceq_{wk}$ is a 
total order $\sim_{wk}$ is replaced by full equality =. We use the subscript wk to distinguish this order, mainly when 
writing $\prec_{wk}$, from the other orders used in the sequel. Therefore, the symbols < and $\leq$ always denote the usual 
order of the real numbers. 

Although, in general, many relations may serve as a weak order on PH, in this paper we require the weak 
order $\preceq_{wk}$ to have the following properties: 

Properties 1.13. 

(i) Compatibility with the standard order of the reals, that is 

$$z_1 \preceq_{wk} z_2 \iff a_1 \leq a_2,$$

for any $z_1 = a_1 + p0$ and $z_2 = a_2 + p0$. 

(ii) Compatibility with the arithmetic operations of PH: 

(a) if $z_1 \preceq_{wk} z_2$ then $z_1 + z_3 \preceq_{wk} z_2 + z_3$, for any $z_3 \in PH$; 

(b) if $z_1 \preceq_{wk} z_2$ then $z_1 z_3 \preceq_{wk} z_2 z_3$, for any pseudo positive $z_3 \in PH$; 

(c) if $z_1 \preceq_{wk} z_2$ then $\frac{z_1}{z_3} \preceq_{wk} \frac{z_2}{z_3}$, for any pseudo positive $z_3 \in PH$. 

Viewing PH as a Euclidean space, we usually assume that all the elements that are $\sim_{wk}$-equivalent form a connected 
set. 

Example 1.14. Viewing PH as $R \times R$, our main example for a total order on PH is the lexicographic order 
$<_{lex}$ defined as 

$$a_1 + p b_1 <_{lex} a_2 + p b_2 \iff \begin{cases} a_1 < a_2, & a_1 \neq a_2; \\ b_1 < b_2, & a_1 = a_2; \end{cases} \tag{1.8}$$

which is a total order satisfying the above conditions. 

In the continuation, when writing $\preceq_{wk}$, we assume the weak order $\preceq_{wk}$ is provided with the set structure. 
The reader should keep in mind that one interpretation for an order which is also total is the lexicographic 
order $<_{lex}$. 
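
For readers who prefer code, the lexicographic order on pairs (a, b), read as a + pb, can be sketched as follows: compare real terms first and fall back to phantom terms only on a tie. The function name `lex_less` is hypothetical; Python's built-in tuple comparison happens to implement the same rule, which the last line uses as a cross-check.

```python
def lex_less(z1, z2):
    """Strict lexicographic comparison of a1 + p*b1 and a2 + p*b2."""
    (a1, b1), (a2, b2) = z1, z2
    if a1 != a2:
        return a1 < a2   # real terms decide when they differ
    return b1 < b2       # otherwise the phantom terms decide


samples = [(1, 3), (0, 9), (1, -2), (0, 9)]
print(lex_less((1, 5), (2, -3)))
print(lex_less((1, 2), (1, 3)))
print(all(lex_less(u, v) == (u < v) for u in samples for v in samples))
```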



Remark 1.15. Note that our definition of pseudo positivity, cf. Definition 1.9, is independent of the given 
order $\preceq_{wk}$ on PH. 

In the sequel, mainly for probability theory, we also use the notation $<_{re}$ for the (real) relation 

$$z_1 <_{re} z_2 \iff re(z_1) < re(z_2); \tag{1.9}$$

the other real relations $=_{re}$, $\leq_{re}$, $>_{re}$, and $\geq_{re}$ are defined similarly. 

We define the real-valued function $[\ ]_\alpha : PH \to R$, with a real positive parameter $\alpha \in R$, given by 

$$[\ ]_\alpha : a + pb \mapsto a + \frac{b}{\alpha}, \tag{1.10}$$

and write $[z]_\alpha$ for the image of $z \in PH$ in R. Then, $[\ ]_\alpha$ determines the equivalence relation on PH given by 

$$z_1 \equiv_\alpha z_2 \iff [z_1]_\alpha = [z_2]_\alpha,$$

and written $z_1 \equiv_\alpha z_2$. (Note that in the special case when $\alpha = 2$, by this definition, we always have $z \equiv_\alpha \bar{z}$, where $\bar{z} = (a + b) - pb$ is the phantom conjugate.) 
The quotient ring of PH, taken with respect to $[\ ]_\alpha$, is denoted as $PH/_\alpha$; clearly $PH/_\alpha \cong R$. 
In the same way, $[\ ]_\alpha$ induces a weak order on PH, provided as 

$$z_1 \leq_\alpha z_2 \iff [z_1]_\alpha \leq [z_2]_\alpha, \tag{1.11}$$

and satisfying Properties 1.13; the relations $<_\alpha$, $>_\alpha$, and $\geq_\alpha$ are determined similarly. 
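
A brief sketch of this map and the relations it induces is given below. It assumes the reading that a + pb is sent to a + b/α (so that, for α = 2, a phantom number and its conjugate (a + b) - pb land on the same real value); the function names `bracket`, `equivalent`, and `leq_alpha` are hypothetical.

```python
from fractions import Fraction


def bracket(z, alpha):
    """Assumed form of the map: a + p*b  ->  a + b / alpha, for alpha > 0."""
    if alpha <= 0:
        raise ValueError("alpha must be a positive real parameter")
    a, b = z
    return a + Fraction(b) / alpha


def equivalent(z1, z2, alpha):
    """Equivalence induced by the map: equal images."""
    return bracket(z1, alpha) == bracket(z2, alpha)


def leq_alpha(z1, z2, alpha):
    """Weak order induced by the map: compare images in the reals."""
    return bracket(z1, alpha) <= bracket(z2, alpha)


def conj(z):
    """Phantom conjugate (a + b) - p*b."""
    a, b = z
    return (a + b, -b)


z = (2, 3)
print(bracket(z, 2), bracket(conj(z), 2))  # same image for alpha = 2
print(equivalent(z, conj(z), 2))
print(leq_alpha((1, 0), (0, 4), 2))        # compares 1 with 0 + 4/2
```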

These relations are very important for advanced topics in phantom probability theory and their applications, 
mainly discussed in the sequel papers [H [9] . 

1.4. Powers and exponents. Writing $z^n$, with $n \in N$, for the product $z \cdots z$ with z repeated n times, for 
any z = a + pb we have 

$$z^n = a^n + p \sum_{i=1}^{n} \binom{n}{i} a^{n-i} b^i;$$

as usual, $z^0$ is identified with the unit 1. This form leads to the following friendly formula: 

$$z^n = a^n + p\big((a + b)^n - a^n\big) = a^n + p(\hat{z}^n - a^n). \tag{1.12}$$

Following accepted standards, we write $z^{-n}$ for $(z^{-1})^n$ and therefore get the extension to integral powers of 
phantom numbers. 

Equation (1.12) plays a main role throughout our development and, together with Equation (1.2), leads to 
the next important formula, which is used frequently in the sequel: 

$$z_1^m z_2^n = a_1^m a_2^n + p\big(\hat{z}_1^m \hat{z}_2^n - a_1^m a_2^n\big). \tag{1.13}$$

(To verify this equality, combine Equation (1.12) and Equation (1.2).) 
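
The closed form for powers can be compared with plain repeated multiplication. In the sketch below, `power` multiplies n times using the ring product, while `power_realized` evaluates the real term as $a^n$ and the phantom term as $(a + b)^n - a^n$; both function names are hypothetical, and the comparison is restricted to natural exponents.

```python
def mul(x, y):
    """Phantom multiplication on pairs (a, b) ~ a + p*b."""
    (a1, b1), (a2, b2) = x, y
    return (a1 * a2, a1 * b2 + b1 * a2 + b1 * b2)


def power(z, n):
    """z^n by repeated multiplication, with z^0 equal to the unit 1 + p0."""
    result = (1, 0)
    for _ in range(n):
        result = mul(result, z)
    return result


def power_realized(z, n):
    """z^n through the closed form a^n + p((a + b)^n - a^n)."""
    a, b = z
    return (a ** n, (a + b) ** n - a ** n)


z = (2, 3)
for n in range(5):
    print(n, power(z, n), power_realized(z, n))
```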

Properties 1.16. Given a phantom number $z \in PH$, then: 

(1) $z^i z^j = z^{i+j}$, 

(2) $\frac{z^i}{z^j} = z^{i-j}$, 

(3) $(z^i)^j = (z^j)^i = z^{ij}$, 

for any $i, j \in Z$. 

Of course, one can take an arbitrary finite number of multiplicands, $z_1, z_2, \ldots, z_n$, and get recursively 

$$z_1^{i_1} z_2^{i_2} \cdots z_n^{i_n} = a_1^{i_1} a_2^{i_2} \cdots a_n^{i_n} + p\big(\hat{z}_1^{i_1} \hat{z}_2^{i_2} \cdots \hat{z}_n^{i_n} - a_1^{i_1} a_2^{i_2} \cdots a_n^{i_n}\big) \tag{1.14}$$

for any $i_1, i_2, \ldots, i_n \in Z$. 

Definition 1.17. When a phantom equation Q can be written in terms of two real equations, $Q_{re}$ and $\hat{Q}$, as 

$$Q = Q_{re} + p(\hat{Q} - Q_{re}),$$

we say that Q has a realization form, or equivalently, that it admits the realization property. 

For that matter, an equation might be an arithmetic expression or a function, where $Q_{re}$ and $\hat{Q}$ stand respectively 
for the real term and the reduction of each argument involved in Q. 

For example, Equations (1.12), (1.13), and (1.14) above admit the realization property. In the sequel, we 
will see that many other familiar equations admit this nice property. Having this property, as spelled out later 
for probability theory, phantom results are induced by known results for reals, which makes the development 
much easier. 

Since the realization property is satisfied for each $z \in PH$ and any natural power $n \in N$, cf. Equation 
(1.12), it is easy to determine the n'th root, if it exists, of a phantom number z = a + pb as: 

$$\sqrt[n]{z} = \sqrt[n]{a} + p\big(\sqrt[n]{\hat{z}} - \sqrt[n]{a}\big), \tag{1.15}$$

where $\sqrt[n]{a}$ and $\sqrt[n]{\hat{z}}$ are, respectively, the real n'th roots of a and $\hat{z} = a + b$, and $n \in N$ is a positive integer. 
Clearly, when n is even, both a and $\hat{z}$ must be nonnegative. 

In the usual way, we sometimes write $z^{1/n}$ for $\sqrt[n]{z}$, and have the properties: 

Properties 1.18. Given pseudo nonnegative phantom numbers $z$, $z_1$, and $z_2$ then: 

(1) $\sqrt[n]{z_1}\,\sqrt[n]{z_2} = \sqrt[n]{z_1 z_2}$, 

(2) $\sqrt[n]{\frac{z_1}{z_2}} = \frac{\sqrt[n]{z_1}}{\sqrt[n]{z_2}}$, for pseudo positive $z_2$, 

(3) $\big(\sqrt[n]{z}\big)^m = z^{m/n}$, 

for any positive $m, n \in N$. 

In the specific case when n = 2, clearly, each pseudo nonnegative phantom number $z = a + pb \in PH$ has a 
square root 

$$\sqrt{a + pb} = \sqrt{a} + p\big(\sqrt{a + b} - \sqrt{a}\big) = \sqrt{a} + p\big(\sqrt{\hat{z}} - \sqrt{a}\big). \tag{1.16}$$

Actually, $\sqrt{a}$ in the equation stands for $\pm\sqrt{a}$; therefore there always exists a nonnegative square root of 
a + pb, i.e. a root whose real and phantom terms are both nonnegative. 
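
A floating-point sketch of the square root follows. It assumes the realization form in which the real term of the root is the root of a and the reduction of the root is the root of a + b, and it takes the principal (nonnegative) real roots only; the function name `phantom_sqrt` is hypothetical.

```python
import math


def mul(x, y):
    """Phantom multiplication on pairs (a, b) ~ a + p*b."""
    (a1, b1), (a2, b2) = x, y
    return (a1 * a2, a1 * b2 + b1 * a2 + b1 * b2)


def phantom_sqrt(z):
    """Principal square root: sqrt(a) + p (sqrt(a + b) - sqrt(a))."""
    a, b = z
    if a < 0 or a + b < 0:
        raise ValueError("both a and a + b must be nonnegative")
    root_a = math.sqrt(a)
    return (root_a, math.sqrt(a + b) - root_a)


r = phantom_sqrt((4, 5))
print(r)          # root of 4 + 5p
print(mul(r, r))  # squaring the root recovers 4 + 5p
```

Squaring the result with the ring product recovers the original number, up to floating-point rounding for inputs whose roots are not exactly representable.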

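In the standard way, we define the exponent of an element $z \in PH$ to be the infinite phantom sum 

$$e^z = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \frac{z^4}{4!} + \cdots.$$

Proposition 1.19. Given $z$, $z_1$ and $z_2$ in PH then: 

(1) $e^0 = 1$, 

(2) $e^z = e^a + p(e^{a+b} - e^a)$, 

(3) $e^{z_1} e^{z_2} = e^{z_1 + z_2}$, 

(4) $e^{z_1} / e^{z_2} = e^{z_1 - z_2}$. 

Proof. (1) is by definition. (2) Expand $e^z$ and use Equation (1.12), i.e. 

$$\begin{aligned}
e^{a + pb} &= 1 + (a + pb) + \frac{(a + pb)^2}{2!} + \frac{(a + pb)^3}{3!} + \cdots \\
&= 1 + a + \frac{a^2}{2!} + \frac{a^3}{3!} + \cdots \\
&\quad + p\left(\Big(1 + (a + b) + \frac{(a+b)^2}{2!} + \frac{(a+b)^3}{3!} + \cdots\Big) - \Big(1 + a + \frac{a^2}{2!} + \frac{a^3}{3!} + \cdots\Big)\right) \\
&= e^a + p(e^{a+b} - e^a).
\end{aligned}$$

(3) Using the identity in (2), write 

$$\begin{aligned}
e^{z_1} e^{z_2} &= \big(e^{a_1} + p(e^{a_1 + b_1} - e^{a_1})\big)\big(e^{a_2} + p(e^{a_2 + b_2} - e^{a_2})\big) \\
&= e^{a_1} e^{a_2} + p\big(e^{a_1}(e^{a_2 + b_2} - e^{a_2}) + e^{a_2}(e^{a_1 + b_1} - e^{a_1}) + (e^{a_1 + b_1} - e^{a_1})(e^{a_2 + b_2} - e^{a_2})\big) \\
&= e^{a_1 + a_2} + p\big(e^{a_1 + a_2 + b_1 + b_2} - e^{a_1 + a_2}\big) \\
&= e^{(a_1 + a_2) + p(b_1 + b_2)}.
\end{aligned}$$

(4) Straightforward from (3) by taking $e^{z_1 - z_2} = e^{z_1 + (-z_2)}$. □ 

Proposition 1.19 (2) yields the following convenient form, i.e. the realization form, for the phantom exponent: 

$$e^z = e^a + p(e^{a+b} - e^a) = e^a + p(e^{\hat{z}} - e^a), \tag{1.17}$$

often used in phantom probability theory. 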
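
The agreement between the series definition and the realization form can be checked numerically. The sketch below sums a truncated series using the ring operations and compares it with the closed form whose real term is $e^a$ and whose phantom term is $e^{a+b} - e^a$. The function names `exp_series` and `exp_realized`, and the truncation length, are hypothetical choices made for this illustration.

```python
import math


def add(x, y):
    """Coordinate-wise phantom addition on pairs (a, b) ~ a + p*b."""
    return (x[0] + y[0], x[1] + y[1])


def mul(x, y):
    """Phantom multiplication: a1 a2 + p (a1 b2 + b1 a2 + b1 b2)."""
    (a1, b1), (a2, b2) = x, y
    return (a1 * a2, a1 * b2 + b1 * a2 + b1 * b2)


def exp_series(z, terms=40):
    """Truncated series 1 + z + z^2/2! + ... evaluated with phantom arithmetic."""
    total = (1.0, 0.0)  # the unit 1 + p0
    term = (1.0, 0.0)   # running term z^k / k!
    for k in range(1, terms):
        term = mul(term, z)
        term = (term[0] / k, term[1] / k)  # scalar division acts on both terms
        total = add(total, term)
    return total


def exp_realized(z):
    """Closed form e^a + p (e^(a + b) - e^a)."""
    a, b = z
    return (math.exp(a), math.exp(a + b) - math.exp(a))


z = (0.5, -0.3)
print(exp_series(z))
print(exp_realized(z))
```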

Analogously to classical theory, for any pseudo positive $z \in PH$ we define the logarithm as 

$$\log(z) = (z - 1) - \frac{(z - 1)^2}{2} + \frac{(z - 1)^3}{3} - \frac{(z - 1)^4}{4} + \cdots,$$

where 1 = 1 + p0, and prove that 

$$\log(z) = \log(a) + p\big(\log(a + b) - \log(a)\big) = \log(a) + p\big(\log(\hat{z}) - \log(a)\big),$$

that is, the realization property for the phantom logarithm. 

1.5. Phantom spaces. Modules over the phantom ring, called phantom modules, are just like standard 
modules over rings [11]. For the reader's convenience we state this explicitly: 

Definition 1.20. A phantom PH-module V is an additive group $(V, \oplus, 0_V)$ together with a scalar multiplication 
$PH \times V \to V$ satisfying the following properties for all $z \in PH$ and $v, w \in V$: 

(i) $z(v \oplus w) = zv \oplus zw$; 

(ii) $(z_1 \oplus z_2)v = z_1 v \oplus z_2 v$; 

(iii) $(z_1 z_2)v = z_1(z_2 v)$; 

(iv) $1v = v$; 

(v) $0v = 0_V = z 0_V$. 

The direct sum $\bigoplus_{j \in J} PH$ of copies (indexed by J) of the phantom ring PH is denoted as $PH^{(J)}$, with zero 
element 0 = (0, ..., 0), and is called the phantom space. When J = {1, ..., n}, then the phantom space 
$PH^{(J)}$ is denoted as $PH^{(n)}$ and we say that $PH^{(n)}$ is an n-phantom space. An element of $PH^{(n)}$ is just an 
n-tuple $(z_1, \ldots, z_n)$ and is denoted as $\mathbf{z}$. 

Denoting the nonnegative real numbers as R+, we recall the standard definition of a norm, formulated for 
the n-phantom space: 

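Definition 1.21. A norm on $PH^{(n)}$ is a real-valued function $\|\ \| : PH^{(n)} \to R_+$ that satisfies: 

(i) $0 \leq \|\mathbf{z}\| \in R$ and $\|\mathbf{z}\| = 0$ iff $\mathbf{z} = 0$, 

(ii) $\|r\mathbf{z}\| = |r|\,\|\mathbf{z}\|$ for each $r \in R$, 

(iii) $\|\mathbf{z}' \oplus \mathbf{z}''\| \leq \|\mathbf{z}'\| + \|\mathbf{z}''\|$, 

for any $\mathbf{z}, \mathbf{z}', \mathbf{z}'' \in PH^{(n)}$. 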
Figure 1. (a) The iso-norm points on the compactification A of PH. (b) The $[\ ]_\alpha$-equivalent 
points, for $\alpha := 2$, on A. 



In what follows we use the absolute value, also called a modulus, $|\ | : PH \to R_+$ given by 

$$|a + pb| = \sqrt{\Big(a + \frac{b}{2}\Big)^2 + \Big(\frac{b}{2}\Big)^2}. \tag{1.18}$$

(When z is only a real term, i.e. z = a + p0, this definition coincides with the familiar absolute value of the 
reals.) 

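Proposition 1.22. The absolute value $|\ |$ as defined in Equation (1.18) is a norm on PH. 

Proof. (i) and (ii) are immediate by definitions. To prove (iii), we show that $|z_1 + z_2|^2 \leq (|z_1| + |z_2|)^2$. 
Expanding both sides of this form, and letting $\alpha_i = a_i + \frac{b_i}{2}$ and $\beta_i = \frac{b_i}{2}$ for i = 1, 2, we have 

$$(\alpha_1 + \alpha_2)^2 + (\beta_1 + \beta_2)^2 \leq \alpha_1^2 + \beta_1^2 + 2|z_1||z_2| + \alpha_2^2 + \beta_2^2.$$

Discard similar components on both sides and write $|z_1|$ and $|z_2|$ explicitly to get 

$$2\alpha_1 \alpha_2 + 2\beta_1 \beta_2 \leq 2\sqrt{\alpha_1^2 + \beta_1^2}\,\sqrt{\alpha_2^2 + \beta_2^2}.$$

Canceling the common multipliers and taking squares, we have $(\alpha_1 \alpha_2 + \beta_1 \beta_2)^2 \leq (\alpha_1^2 + \beta_1^2)(\alpha_2^2 + \beta_2^2)$, and 
thus 

$$\alpha_1^2 \alpha_2^2 + 2\alpha_1 \alpha_2 \beta_1 \beta_2 + \beta_1^2 \beta_2^2 \leq \alpha_1^2 \alpha_2^2 + \alpha_1^2 \beta_2^2 + \beta_1^2 \alpha_2^2 + \beta_1^2 \beta_2^2,$$

which implies $0 \leq \alpha_1^2 \beta_2^2 - 2\alpha_1 \alpha_2 \beta_1 \beta_2 + \beta_1^2 \alpha_2^2 = (\alpha_1 \beta_2 - \beta_1 \alpha_2)^2$. This proves property (iii) of Definition 1.21. □ 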

Using the reduced form of phantom numbers, and Properties 1.7 (1), $|z|^2$ can be written also as 

$$|z|^2 = \Big(a + \frac{\hat{z} - a}{2}\Big)^2 + \Big(\frac{\hat{z} - a}{2}\Big)^2 = \frac{a^2 + \hat{z}^2}{2}.$$

Having a weak order satisfying Properties 1.13 (i), one also has $|z| \succeq_{wk} 0$ for any $z \in PH$. 

Remark 1.23. There are several main reasons for defining the absolute value on PH as it has been defined in 
Equation (1.18): 

(i) $|\bar{z}| = |z|$, for each $z \in PH$; indeed, to verify this identity, we have 

$$|\bar{z}| = |(a + b) - pb| = \sqrt{\Big(a + b - \frac{b}{2}\Big)^2 + \Big(\frac{b}{2}\Big)^2} = \sqrt{\Big(a + \frac{b}{2}\Big)^2 + \Big(\frac{b}{2}\Big)^2} = |z|.$$

(ii) Considering PH as $R \times R$, the two point compactification of each copy of R is viewed as a parallelogram 
with vertices (0,0), (0,1), (1,0), and (1,-1) together with the following correspondences: 

$$(-\infty, -\infty) \mapsto (0, 0), \quad (-\infty, \infty) \mapsto (0, 1), \quad (\infty, -\infty) \mapsto (1, -1), \quad (\infty, \infty) \mapsto (1, 0).$$

(See Figure 1 (a).) 

Accordingly, 0 + p0 is the unique point having absolute value 0, and 1 + p0 is the unique point 
having absolute value 1. The significance of this property becomes apparent later in the discussion on 
the phantom probability measure. 

(iii) The same view of (ii), applied for $[\ ]_\alpha$ with $\alpha = 2$, cf. Equation (1.10), shows that the classes $[0]_\alpha$ and 
$[1]_\alpha$ in $PH/_\alpha$ are singletons. (See Figure 1 (b).) 

These properties are very important for applications in phantom probability theory. 
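The identities of Remark 1.23 are easy to check numerically. The following is a minimal sketch, assuming a phantom number $a + \wp b$ is stored as the pair (a, b); the function names `ph_abs`, `conjugate` and `ph_abs_reduced` are our own illustrative choices and do not come from the text.

```python
import math

def ph_abs(a, b):
    """Absolute value (1.18) of the phantom number a + p*b."""
    return math.sqrt((a + b / 2) ** 2 + (b / 2) ** 2)

def conjugate(a, b):
    """Conjugate of a + p*b, i.e. (a + b) - p*b."""
    return (a + b, -b)

def ph_abs_reduced(a, b):
    """The same modulus written with the reduced form z_hat = a + b."""
    z_hat = a + b
    return math.sqrt((a * a + z_hat * z_hat) / 2)

if __name__ == "__main__":
    a, b = 0.4, 0.2
    # A number, its conjugate, and the reduced-form formula give one modulus.
    print(ph_abs(a, b), ph_abs(*conjugate(a, b)), ph_abs_reduced(a, b))
    # Moduli at the four vertices of the parallelogram in Remark 1.23 (ii).
    for vertex in [(0, 0), (0, 1), (1, 0), (1, -1)]:
        print(vertex, ph_abs(*vertex))
```

On the parallelogram, only the vertex (0,0) has modulus 0 and only the vertex (1,0) has modulus 1, which is the uniqueness property used in item (ii).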

Having the norm $|\ |$ on $\mathbb{PH}$, we equip $\mathbb{PH}$ with the following relation:
(1.19)    $z_1 \le_{|\,|} z_2 \iff |z_1| \le |z_2|$.

Accordingly, 0 is the unique minimal element in $\mathbb{PH}$. (Clearly, this relation is also a weak order on $\mathbb{PH}$; however, since $\le_{|\,|}$ ignores signs, it does not satisfy Properties 1.13.)

Based on Equation (1.18) we define the norm on the n-phantom space $\mathbb{PH}^{(n)}$ as:

where $z = (z_1, \dots, z_n)$.
Proposition 1.24. $\|\ \|$ is a norm.

Proof. Straightforward from $|\ |$ being a norm. □

We use $\|\ \|$ to define the map $d : \mathbb{PH}^{(n)} \times \mathbb{PH}^{(n)} \to \mathbb{R}^+$ given by
(1.20)    $d : z_1 \times z_2 \mapsto \|z_1 - z_2\|$,

where the subtraction is taken coordinate-wise. We write $d(z_1, z_2)$ for the image of $z_1 \times z_2$ under $d$.
Proposition 1.25. $d$ is a metric on $\mathbb{PH}^{(n)}$.

Proof. By Proposition 1.24 we have $d(z_1, z_2) \ge 0$ for any $z_1$ and $z_2$, with equality iff $z_1 = z_2$. Symmetry is clear. The inequality $d(z_1, z_3) \le d(z_1, z_2) + d(z_2, z_3)$ is derived from the triangle inequality satisfied by $\|\ \|$. □

Note that using the metric (1.20) we always have

$d(z, \bar z) = |a - (a + b) + \wp(b - (-b))| = |-b + \wp 2b| = |b|$,
for every $z = a + \wp b$ in $\mathbb{PH}$.
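As a small numerical illustration of the metric, the sketch below treats only the case n = 1, where the distance is $|z_1 - z_2|$; this restriction is our assumption, made so that nothing about the n-dimensional norm has to be guessed. Phantom numbers are stored as pairs (a, b) and the helper names are hypothetical.

```python
import math

def ph_abs(z):
    """Absolute value (1.18) of z = a + p*b, stored as the pair (a, b)."""
    a, b = z
    return math.sqrt((a + b / 2) ** 2 + (b / 2) ** 2)

def conjugate(z):
    """Conjugate (a + b) - p*b of z = a + p*b."""
    a, b = z
    return (a + b, -b)

def dist(z1, z2):
    """d(z1, z2) = |z1 - z2| for single phantom numbers (n = 1 only)."""
    return ph_abs((z1[0] - z2[0], z1[1] - z2[1]))

if __name__ == "__main__":
    z = (0.3, -0.5)
    # Distance between a number and its conjugate is |b|.
    print(dist(z, conjugate(z)), abs(z[1]))
```

The distance between a phantom number and its conjugate depends only on the size of the phantom term, in line with the identity displayed above.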

Remark 1.26. The fact that $\mathbb{PH}^{(n)}$ is a metric space allows us to define a Borel $\sigma$-algebra over $\mathbb{PH}^{(n)}$ in the usual way.

1.6. Polynomials. Polynomials over the phantom ring, called phantom polynomials, are defined just as formal polynomials over rings [12]. As usual, polynomials, say in n phantom variables $\lambda_1, \dots, \lambda_n$, form a ring which is denoted as $\mathbb{PH}[\lambda_1, \dots, \lambda_n]$; these polynomials can also be viewed as sums of polynomials in 2n real variables.

Remark 1.27. Given a polynomial $f = \sum_i \alpha_i \lambda_1^{i_1} \cdots \lambda_n^{i_n}$ in $\mathbb{PH}[\lambda_1, \dots, \lambda_n]$, it can be written as $f = f_{re} + \wp f_{ph}$, where $f_{re}$ and $f_{ph}$ are real polynomials.

Suppose $\lambda_i = u_i + \wp v_i$, $i = 1, \dots, n$, is a sum of two variables $u_i$ and $v_i$ that take real values, and let

$f_{re}(u_1, \dots, u_n) = \sum_i \mathrm{re}(\alpha_i)\, u_1^{i_1} \cdots u_n^{i_n}$  and  $\hat f(\hat\lambda_1, \dots, \hat\lambda_n) = \sum_i \hat\alpha_i\, \hat\lambda_1^{i_1} \cdots \hat\lambda_n^{i_n}$

be two polynomials over the reals. (Note that, since $\hat\lambda_i = u_i + v_i$, $\hat f$ is considered as a real polynomial in 2n variables.) Then, by Equation (1.13), $f$ is written as

$f(\lambda_1, \dots, \lambda_n) = f_{re}(u_1, \dots, u_n) + \wp\left(\hat f(\hat\lambda_1, \dots, \hat\lambda_n) - f_{re}(u_1, \dots, u_n)\right)$.

Therefore, phantom polynomials also admit the realization property, in this case in the sense of functions.
The conjugate polynomial $\bar f$ of $f = \sum_i \alpha_i \lambda_1^{i_1} \cdots \lambda_n^{i_n}$ is defined as

$\bar f = \sum_i \bar\alpha_i\, \lambda_1^{i_1} \cdots \lambda_n^{i_n}$.

Proposition 1.28. $\overline{f(z_1, \dots, z_n)} = \bar f(\bar z_1, \dots, \bar z_n)$ for any $f \in \mathbb{PH}[\lambda_1, \dots, \lambda_n]$ and each $(z_1, \dots, z_n) \in \mathbb{PH}^{(n)}$.

Proof. Straightforward by Properties 1.7. □
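The realization of phantom polynomials described in Remark 1.27 can be compared with direct evaluation in the phantom ring. The sketch below is illustrative only: it assumes one variable, stores $a + \wp b$ as the pair (a, b), and uses the product rule $(a + \wp b)(c + \wp d) = ac + \wp(ad + bc + bd)$; all function names are ours.

```python
def add(z, w):
    """Componentwise sum of two phantom numbers."""
    return (z[0] + w[0], z[1] + w[1])

def mul(z, w):
    """(a + p b)(c + p d) = ac + p(ad + bc + bd)."""
    a, b = z
    c, d = w
    return (a * c, a * d + b * c + b * d)

def eval_phantom(coeffs, lam):
    """Horner evaluation in the phantom ring; coefficients are phantom
    pairs listed from the highest degree down to the constant term."""
    acc = (0.0, 0.0)
    for alpha in coeffs:
        acc = add(mul(acc, lam), alpha)
    return acc

def eval_real(coeffs, t):
    """Horner evaluation of a real polynomial."""
    acc = 0.0
    for c in coeffs:
        acc = acc * t + c
    return acc

def eval_by_realization(coeffs, lam):
    """f(lam) = f_re(u) + p (f_hat(u + v) - f_re(u)) for lam = u + p v."""
    u, v = lam
    f_re = eval_real([c for c, _ in coeffs], u)          # real parts of coefficients
    f_hat = eval_real([c + d for c, d in coeffs], u + v)  # reduced coefficients
    return (f_re, f_hat - f_re)

if __name__ == "__main__":
    coeffs = [(2.0, 1.0), (0.0, -1.0), (5.0, 0.5)]
    lam = (0.5, 0.25)
    print(eval_phantom(coeffs, lam), eval_by_realization(coeffs, lam))
```

Both routes give the same pair because taking the real term and taking the reduced form each commute with sums and products.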

1.7. Basic analysis. Finally, we provide the necessary notions for basic analysis over the phantoms; we 
present only the general tools needed for our exposition. Most of these notions are the phantom analogues to 
those in complex analysis; in general we adopt the philosophy of analysis over the complexes. 

Definition 1.29. Let $z_1, z_2, \dots$ be an infinite sequence of phantom numbers, and let $z$ be another phantom number. We say that the sequence $z_n$ converges to $z$, written $\lim_{n \to \infty} z_n = z$, if for every real $\epsilon > 0$ there exists some $n_0$ such that $|z_n - z| < \epsilon$, for all $n > n_0$.

Lemma 1.30. A sequence $z_i = a_i + \wp b_i$, $i = 1, 2, \dots$, converges to $z = a + \wp b$ iff $\lim_{n \to \infty} a_n = a$ and $\lim_{n \to \infty} b_n = b$ as real sequences.

Proof. ($\Leftarrow$) Clear by definition, cf. Equation (1.18).

($\Rightarrow$) Write $\lim_{n \to \infty} \sqrt{\left(a_n - a + \tfrac{b_n - b}{2}\right)^2 + \left(\tfrac{b_n - b}{2}\right)^2} = 0$. Each of $\left(a_n - a + \tfrac{b_n - b}{2}\right)^2$ and $\left(\tfrac{b_n - b}{2}\right)^2$ is nonnegative and converges to 0. Thus, by the latter component, $b_n \to b$. Then, by the first component, $a_n \to a$. □

A function $f : D \to \mathbb{PH}$, whose domain is a subset $D \subset \mathbb{PH}^{(n)}$, is termed a phantom function, while a
function g : — > R is called a real function. We say that a function is a phantom-valued function if its 
range lies in $\mathbb{PH}$; similarly a function whose range lies in $\mathbb{R}$ is called real-valued.

Definition 1.31. Given a phantom function $f : D \to \mathbb{PH}$, we say that $w_0 \in \mathbb{PH}$ is the limit of $f$ when $z \to z_0 \in D$ if for any real $\epsilon > 0$ there exists a real $\delta > 0$ such that for any $z$ with $|z - z_0| < \delta$ we have $|f(z) - w_0| < \epsilon$. In such a case we write $\lim_{z \to z_0} f(z) = w_0$.

A function $f$ is continuous at $z_0 \in D$ if $\lim_{z \to z_0} f(z) = f(z_0)$, and is said to be continuous on $D$ if it is continuous at each $z_0 \in D$.

Suppose $f : D \to \mathbb{PH}$ is a phantom function, where $D \subset \mathbb{PH}$ is a set, and $z_0$ is an interior point of $D$. The derivative of $f$ at $z_0$ is defined as

$f'(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0}$,

provided this limit exists (depending also on $z - z_0$ being a nonzero divisor). In this case, $f$ is called differentiable at $z_0$. If $f$ is differentiable at all points in an open disk centered at $z_0$ then $f$ is called analytic at $z_0$. The phantom function $f$ is analytic on the open set $D \subset \mathbb{PH}$ if it is differentiable (and hence analytic) at every point in $D$. (The familiar properties of derivation are also satisfied for phantom derivation.)

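Example 1.32. The derivative of a polynomial $f = \sum_i \alpha_i \lambda^i$ at $z_0$, where $\lambda = u + \wp v$, written as $f(\lambda) = f_{re}(u) + \wp(\hat f(\hat\lambda) - f_{re}(u))$ by Remark 1.27, is provided by using Equation (1.5) as:

$f'(z_0) = \lim_{z \to z_0} \left[ \frac{f_{re}(a) - f_{re}(a_0)}{a - a_0} + \wp\, \frac{\left(\hat f(\hat z) - f_{re}(a) - \hat f(\hat z_0) + f_{re}(a_0)\right)(a - a_0) - \left(f_{re}(a) - f_{re}(a_0)\right)(b - b_0)}{(a - a_0)(a - a_0 + b - b_0)} \right]$

$= f'_{re}(a_0) + \wp \lim_{z \to z_0} \left[ \frac{\left(\hat f(\hat z) - \hat f(\hat z_0)\right)(a - a_0)}{(a - a_0)(\hat z - \hat z_0)} - \frac{\left(f_{re}(a) - f_{re}(a_0)\right)(\hat z - \hat z_0)}{(a - a_0)(\hat z - \hat z_0)} \right]$

$= f'_{re}(a_0) + \wp\left(\hat f'(\hat z_0) - f'_{re}(a_0)\right)$.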
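The closed form for the derivative of a polynomial can be compared with a phantom difference quotient. The sketch below is a numerical illustration for the single polynomial $f(z) = z^3$; it assumes the product rule $(a + \wp b)(c + \wp d) = ac + \wp(ad + bc + bd)$ and the quotient rule $(a + \wp b)/(c + \wp d) = a/c + \wp\,(bc - ad)/(c(c + d))$, valid when the denominator is not a zero divisor. The function names are ours.

```python
def mul(z, w):
    """(a + p b)(c + p d) = ac + p(ad + bc + bd)."""
    a, b = z
    c, d = w
    return (a * c, a * d + b * c + b * d)

def div(z, w):
    """(a + p b)/(c + p d); requires c != 0 and c + d != 0."""
    a, b = z
    c, d = w
    return (a / c, (b * c - a * d) / (c * (c + d)))

def f(z):
    """The phantom polynomial f(z) = z^3."""
    return mul(mul(z, z), z)

def derivative_by_realization(z0):
    """f_re'(a0) + p (f_hat'(a0 + b0) - f_re'(a0)) for f(z) = z^3."""
    a0, b0 = z0
    f_re_prime = 3 * a0 ** 2
    f_hat_prime = 3 * (a0 + b0) ** 2
    return (f_re_prime, f_hat_prime - f_re_prime)

def difference_quotient(z0, h):
    """(f(z0 + h) - f(z0)) / h with phantom arithmetic; h must not be a zero divisor."""
    z = (z0[0] + h[0], z0[1] + h[1])
    fz, fz0 = f(z), f(z0)
    return div((fz[0] - fz0[0], fz[1] - fz0[1]), h)

if __name__ == "__main__":
    z0 = (0.5, 0.25)
    print(derivative_by_realization(z0))
    print(difference_quotient(z0, (1e-6, 2e-6)))
```

For this polynomial the closed form also agrees with $3 z_0^2$ computed by phantom multiplication, as one expects from the familiar rules of derivation.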

When a phantom function has the realization property, i.e. it is of the form $f_{re}(t) + \wp f_{ph}(t)$, where both $f_{re}$ and $f_{ph}$ are real functions with $t \in \mathbb{R}$, the derivative of $f$ is given as

(1.21)    $f' = f'_{re}(t) + \wp f'_{ph}(t)$.

Indeed, write

$f'(t_0) = \lim_{t \to t_0} \frac{\left(f_{re}(t) + \wp f_{ph}(t)\right) - \left(f_{re}(t_0) + \wp f_{ph}(t_0)\right)}{t - t_0}$,

which by Equation (1.5) is

$f'(t_0) = \lim_{t \to t_0} \frac{f_{re}(t) - f_{re}(t_0)}{t - t_0} + \wp \lim_{t \to t_0} \frac{f_{ph}(t) - f_{ph}(t_0)}{t - t_0}$.

Phantom integration is not really anything different from real integration over paths. For a continuous phantom-valued function $\phi(t) : [a, b] \subset \mathbb{R} \to \mathbb{PH}$, where $\phi = \phi_{re} + \wp \phi_{ph}$, we define

(1.22)    $\int_a^b \phi(t)\,dt = \int_a^b \phi_{re}(t)\,dt + \wp \int_a^b \phi_{ph}(t)\,dt$.

For a function which takes phantom numbers as arguments, we integrate over a path $\gamma$ (instead of a real interval) in $\mathbb{PH}$ realized as $\mathbb{R} \times \mathbb{R}$. If one thinks about the substitution rule for real integrals, the following definition, which is based on Equation (1.22), should come as no surprise.

Definition 1.33. Suppose $\gamma$ is a smooth path parameterized by $\gamma(t) : [a, b] \to \mathbb{PH}$, $a \le t \le b$, where $t, a, b \in \mathbb{R}$, and $f : \mathbb{PH} \to \mathbb{PH}$ is a phantom function which is continuous on $\gamma$. Then we define the integral of $f$ on $\gamma$ as

(1.23)    $\int_\gamma f(z)\,dz = \int_a^b f(\gamma(t))\,\gamma'(t)\,dt$.

This is simply the path integral of $f$ along the path $\gamma$. This integral can be defined analogously to the Riemann integral as the limit of sums of the form $\sum_k (f \circ \gamma)(\tau_k)\,(\gamma(t_k) - \gamma(t_{k-1}))$, so it is the Riemann-Stieltjes integral of $f \circ \gamma$ with respect to $\gamma$. Using this definition, the integral can be extended to rectifiable paths, i.e. ones for which $\gamma$ is only of bounded variation.

Properties 1.34. Suppose $\gamma$ is a smooth path, $f$ and $g$ are phantom functions which are continuous on $\gamma$, and $w \in \mathbb{PH}$ is constant.

(1) $\int_\gamma (f + wg)\,dz = \int_\gamma f\,dz + w \int_\gamma g\,dz$.

(2) If $\gamma$ is parameterized by $\gamma(t)$, $a \le t \le b$, define the path $-\gamma$ through $-\gamma(t) = \gamma(a + b - t)$, $a \le t \le b$. Then $\int_{-\gamma} f\,dz = -\int_\gamma f(z)\,dz$.

(3) If $\gamma_1$ and $\gamma_2$ are paths so that $\gamma_2$ starts where $\gamma_1$ ends then define the curve $\gamma_1\gamma_2$ by following $\gamma_1$ to its end, and then continuing on $\gamma_2$ to its end. Then $\int_{\gamma_1\gamma_2} f(z)\,dz = \int_{\gamma_1} f(z)\,dz + \int_{\gamma_2} f(z)\,dz$.

Assume $f$ is given as $f_{re} + \wp f_{ph}$, and is defined along a smooth path $\gamma$, given in a parametric form

$\gamma = \{z = z(t) : \alpha \le t \le \beta,\ t \in \mathbb{R}\}$,

for some real $\alpha, \beta$; for $z = z_{re} + \wp z_{ph}$, we also write $z_{re} = a(t)$ and $z_{ph} = b(t)$. Then, using the familiar line integral from calculus, Equation (1.23) is written in the following useful form:

(1.24)    $\int_\gamma f(z)\,dz = \int_\alpha^\beta f_{re}(t)\,a'(t)\,dt + \wp \int_\alpha^\beta \left[ f_{re}(t)\,b'(t) + f_{ph}(t)\left(a'(t) + b'(t)\right) \right] dt$,

where $f_{re}(t)$ and $f_{ph}(t)$ stand respectively for $f_{re}(a(t), b(t))$ and $f_{ph}(a(t), b(t))$.
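Equation (1.24) reduces a phantom path integral to two ordinary real integrals, so it can be evaluated with any real quadrature rule. The sketch below is a hedged illustration: the midpoint rule, the test function $f(z) = z$ and the path $\gamma(t) = t + \wp t^2$ on [0, 1] are all our own choices, not taken from the text.

```python
def integrate(g, lo, hi, n=2000):
    """Composite midpoint rule for a real integrand g on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def path_integral(f_re, f_ph, a, b, da, db, lo, hi):
    """Real and phantom parts of the path integral, following (1.24).

    a, b are the real and phantom coordinates of the path, da, db their
    derivatives; f_re, f_ph take the pair (a(t), b(t)).
    """
    real = integrate(lambda t: f_re(a(t), b(t)) * da(t), lo, hi)
    phantom = integrate(
        lambda t: f_re(a(t), b(t)) * db(t)
        + f_ph(a(t), b(t)) * (da(t) + db(t)),
        lo, hi)
    return (real, phantom)

# f(z) = z, so f_re(a, b) = a and f_ph(a, b) = b, along gamma(t) = t + p t^2.
result = path_integral(
    f_re=lambda a, b: a, f_ph=lambda a, b: b,
    a=lambda t: t, b=lambda t: t * t,
    da=lambda t: 1.0, db=lambda t: 2.0 * t,
    lo=0.0, hi=1.0)

if __name__ == "__main__":
    print(result)
```

Here the two real integrands are $t$ and $3t^2 + 2t^3$, whose integrals over [0, 1] are 1/2 and 3/2, so the numerical result should be close to $0.5 + \wp 1.5$.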

2. Phantom probability spaces 

2.1. Phantom probability laws. We first recall the necessary basics of standard measure theory, then we 
further extend these basics to obtain the phantom setting that generalizes the familiar classical probability 
framework. We use [3], [5], [6] as general references for classical probability theory.

A measure space is a triple $(\Omega, \Sigma, \mu)$, where $\Sigma$ is a $\sigma$-algebra of subsets over a set $\Omega$ and $\mu : \Sigma \to [0, \infty]$ is a real-valued function, called a measure, that satisfies the properties:

(i) $\mu(\emptyset) = 0$;

(ii) $\mu\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \mu(A_i)$ for any countable sequence $A_1, A_2, A_3, \dots$ of pairwise disjoint sets in $\Sigma$.
A measure $\mu$ is monotonic if $\mu(A_1) \le \mu(A_2)$ for each $A_1 \subset A_2$.

A probability measure is a measure with total measure one (i.e. $\mu(\Omega) = 1$), cf. [2]; a probability space $(\Omega, \Sigma, P)$ is a measure space with a probability measure $\mu := P$ that satisfies the additional probability axiom

$P(A) \ge 0$, $\forall A \in \Sigma$.

When $(\Omega, \Sigma, P)$ is a probability space, $P(A)$ is said to be the probability of $A$, $\Omega$ is called the sample space and its elements are called outcomes, usually denoted as $\omega_1, \omega_2, \dots$. A collection of possible outcomes is called an event. In the sequel, mainly in the examples, we use the letter $P$ to denote a standard probability measure, i.e. $P : \Sigma \to [0, 1] \subset \mathbb{R}$.

Terminology: In what follows, when using the term "standard" , or "real standard" , we refer to the known 
classical results, based on the above (real) probability measure, appearing in the literature on probability 
theory, OH]. 

Roughly speaking, our aim is to generalize the probability measure $P : \Sigma \to \mathbb{R}$ to a phantom-valued function $\mathcal{P} : \Sigma \to \mathbb{PH}$, whose real component is a standard probability measure while its phantom component satisfies an extra axiom. One way to realize this extra axiom, enforced only on the phantom component, is to understand the phantom as a signed distortion (either positive or negative) assigned to each evaluation of the probability measure. Therefore, given a fixed event, its probability together with an arbitrary distortion should still be positive (in the standard sense) and should not exceed 1.

Remark 2.1. In what follows the sample space $\Omega$ need not be a standard sample space, and is also generalized to a phantom sample space, that is, a sample space consisting of phantom elements, called phantom outcomes. The notation $\Omega$ is also used for a phantom sample space, and we use the standard terminology of outcomes and events, respectively, for elements and subsets of $\Omega$.

Recall that $\mathbb{PH}$ is assumed to be equipped with a weak order $\preceq_{wk}$, coinciding with the standard order on $\mathbb{R}$, usually a total order. However, to ensure that our formalism is abstract enough, we formulate our setting in terms of a general phantom weak order $\preceq_{wk}$ on $\mathbb{PH}$.

A phantom-valued function

$\mathcal{P} : \Sigma \to \mathbb{PH}$,

is called a phantom probability measure if it satisfies the following axioms. We denote the real component and the phantom component of $\mathcal{P}$ as $\mathcal{P}_{re}$ and $\mathcal{P}_{ph}$, respectively, each being a real-valued function, and write $\mathcal{P} = \mathcal{P}_{re} + \wp \mathcal{P}_{ph}$:

Axiom 2.2 (Phantom probability measure).

(i) Nonnegativity: $0 \le \mathcal{P}_{re}(A) \le 1$ for each $A \in \Sigma$,

(ii) Normalization: $\mathcal{P}(\Omega) = 1$,

(iii) Additivity: $\mathcal{P}(A \cup B) = \mathcal{P}(A) + \mathcal{P}(B)$ for any pair of disjoint events $A$ and $B$ in $\Sigma$,

(iv) Phantomization: $-\mathcal{P}_{re}(A) \le \mathcal{P}_{ph}(A) \le 1 - \mathcal{P}_{re}(A)$ for each $A \in \Sigma$.
(The order $\le$ is the standard order of the real numbers.)

As one can see, conditions (i)-(iii) are none other than the well known classical probability axioms, referring to the real component of $\mathcal{P}$ (condition (iii) is also imposed on $\mathcal{P}_{ph}$), while the extra axiom (iv) is enforced on the phantom component. (This axiom can be equivalently written as $0 \le \mathcal{P}_{re}(A) + \mathcal{P}_{ph}(A) \le 1$.) Therefore, the real component $\mathcal{P}_{re}$ of any phantom probability measure $\mathcal{P}$ is always a standard (real) probability measure. These axioms properly frame our earlier probability principles.
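For a finite sample space the axioms can be checked mechanically. The following is a minimal sketch under our own assumptions: the measure is specified by phantom values (real term, phantom term) on the atoms, every subset of atoms is an event, and additivity is built in by summing atoms. The function name `is_phantom_probability` is hypothetical.

```python
from itertools import combinations

def is_phantom_probability(atoms, tol=1e-12):
    """Check Axiom 2.2 for a measure given by its values on the atoms.

    atoms maps each outcome to a pair (real term, phantom term).
    """
    names = list(atoms)
    # (ii) Normalization: the whole space must have value 1 + p*0.
    total_re = sum(atoms[w][0] for w in names)
    total_ph = sum(atoms[w][1] for w in names)
    if abs(total_re - 1) > tol or abs(total_ph) > tol:
        return False
    # (i) and (iv) must hold for every event, i.e. every subset of atoms.
    for r in range(len(names) + 1):
        for event in combinations(names, r):
            re = sum(atoms[w][0] for w in event)
            ph = sum(atoms[w][1] for w in event)
            if not (-tol <= re <= 1 + tol):        # (i) nonnegativity
                return False
            if not (-tol <= re + ph <= 1 + tol):   # (iv) phantomization
                return False
    return True

if __name__ == "__main__":
    print(is_phantom_probability({"H": (0.4, 0.2), "T": (0.6, -0.2)}))
    print(is_phantom_probability({"H": (0.4, 0.7), "T": (0.6, -0.7)}))
```

The second measure fails because the real term plus the distortion of H exceeds 1, which is exactly what axiom (iv) excludes.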

Let $\bar\Delta \subset \mathbb{PH}$ be the set

(2.1)    $\bar\Delta = \{z \in \mathbb{PH} \mid a \in [0, 1],\ -a \le b \le 1 - a\}$,

each of whose points has a real term belonging to the real interval $\bar\Delta_{re} = [0, 1] \subset \mathbb{R}$ and a phantom term limited to the interval $[-a, 1 - a]$, conditional on the real term of the point. The set $\bar\Delta$ is called the phantom probability zone, all of whose elements are pseudo positive.

Remark 2.3. In order to define our probability theory appropriately, we need to enforce the following requirement on the weak order provided with $\mathbb{PH}$:

(2.2)    $0 \preceq_{wk} z \preceq_{wk} 1$, for each $z \in \bar\Delta$.

For example, $\le_\alpha$ with $\alpha > 1$ (cf. Equation (1.11)), or $\le_{|\,|}$ (cf. Equation (1.19)), are weak orders that satisfy this condition. The total order $\le_{lex}$ (cf. Equation (1.8)) also admits this property.

For any phantom probability measure $\mathcal{P}$, one sees that always

$\mathcal{P} : \Sigma \to \bar\Delta$,
cf. Axiom 2.2 (iv). Therefore, the target of a phantom probability measure always lies in $\bar\Delta$. Given a fixed event $A$, we use the Gothic letter $\mathfrak{p}$ to denote the image $\mathcal{P}(A) = p + \wp q$ of $A$, indicating that the corresponding phantom number belongs to $\bar\Delta$, and thus stands for a phantom probability value.

Note that a phantom probability measure is notated by a calligraphic letter, while a standard measure is notated by a capital letter.

To avoid nonzero annihilators, in the sequel we usually restrict the target of the phantom probability measure to the set

$\Delta = \{\mathfrak{p} \in \bar\Delta : \mathfrak{p} \text{ is not a zero divisor}\}$,

which we call the restricted phantom probability zone. (Note that $0 \notin \Delta$ and that $\bar\Delta$ is the topological closure of $\Delta$.) In the remainder, unless otherwise specified, we always assume the probability values are in $\Delta$, i.e. we exclude all the possible zero divisors in $\bar\Delta$. (In view of Remark 1.23 (ii), $\bar\Delta$ is realized as the compactification of $\mathbb{PH}$, while the elements of $\bar\Delta$ are all pseudo nonnegative, cf. Definition 1.9.)

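Lemma 2.4. Suppose $\mathfrak{p}, \mathfrak{p}' \in \bar\Delta$, then:

(i) $(1 - \mathfrak{p}) \in \bar\Delta$,

(ii) $\mathfrak{p}\mathfrak{p}' \in \bar\Delta$.

Proof. (i) Let $\mathfrak{p} = p + \wp q$, then $1 - \mathfrak{p} = (1 - p) + \wp(-q)$. Clearly, as $p \in [0, 1]$, the real term $(1 - p) \in [0, 1]$. The phantom term should satisfy

$-(1 - p) \le -q \le 1 - (1 - p)\ (= p)$,

i.e. $-p \le q \le (1 - p)$, but this is given by the assumption that $\mathfrak{p} \in \bar\Delta$.

(ii) Let $\mathfrak{p} = p + \wp q$ and $\mathfrak{p}' = p' + \wp q'$; then $\mathfrak{p}\mathfrak{p}' = pp' + \wp(pq' + qp' + qq')$, and clearly $pp' \in \bar\Delta_{re}$. Using Axiom 2.2 (iv), write

$p(-p') + (-p)p' + (-p)(-p') \le pq' + qp' + qq' \le p(1 - p') + (1 - p)p' + (1 - p)(1 - p')$,

and expand to get

$-pp' \le pq' + qp' + qq' \le 1 - pp'$,

as desired. (Equivalently, $pq' + qp' + qq' = (p + q)(p' + q') - pp'$, where both $p + q$ and $p' + q'$ lie in $[0, 1]$.) □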
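The closure properties of Lemma 2.4 can also be tested on random points of the zone. The sketch below assumes the zone (2.1) is the set of pairs (a, b) with $0 \le a \le 1$ and $0 \le a + b \le 1$, and samples it by drawing the real term and the reduced value independently; the sampling scheme and the function names are our own.

```python
import random

def in_zone(z, tol=1e-12):
    """Membership in the phantom probability zone (2.1)."""
    a, b = z
    return -tol <= a <= 1 + tol and -tol <= a + b <= 1 + tol

def complement(z):
    """Phantom complement 1 - z of z = a + p*b."""
    return (1 - z[0], -z[1])

def mul(z, w):
    """(a + p b)(c + p d) = ac + p(ad + bc + bd)."""
    a, b = z
    c, d = w
    return (a * c, a * d + b * c + b * d)

def random_zone_point(rng):
    """Draw the real term a and the reduced value s = a + b from [0, 1)."""
    a = rng.random()
    s = rng.random()
    return (a, s - a)

rng = random.Random(0)
samples = [(random_zone_point(rng), random_zone_point(rng)) for _ in range(1000)]
all_closed = all(
    in_zone(complement(p)) and in_zone(mul(p, q)) for p, q in samples)

if __name__ == "__main__":
    print(all_closed)
```

A random test of this kind is only a sanity check of the algebra; the proof above is what establishes the statement.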

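Given an element $\mathfrak{p} \in \bar\Delta$, the element $(1 - \mathfrak{p})$, also in $\bar\Delta$, is regarded as the phantom complement of $\mathfrak{p}$ in $\bar\Delta$.

Lemma 2.5. The image of a phantom probability measure $\mathcal{P}$ is well defined for phantom addition and multiplication, that is $\mathcal{P}(A) + \mathcal{P}(B)$ (for disjoint $A$ and $B$), $\mathcal{P}(A)\mathcal{P}(B)$, and $\mathcal{P}(A^c)$ are in $\bar\Delta$ for any $A, B \in \Sigma$.

Proof. The addition is axiomatic, since for disjoint $A$ and $B$ we have $A \cup B \in \Sigma$, which implies $\mathcal{P}(A) + \mathcal{P}(B) = \mathcal{P}(A \cup B) \in \bar\Delta$; cf. Axiom 2.2 (iv).

For the multiplication, take $\mathfrak{p} = \mathcal{P}(A)$ and $\mathfrak{q} = \mathcal{P}(B)$ and apply Lemma 2.4 (ii). Since $\mathcal{P}(A) \in \bar\Delta$, by Lemma 2.4 (i), $\mathcal{P}(A^c) = 1 - \mathfrak{p}$ is in $\bar\Delta$. □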

Definition 2.6. A triple $(\Omega, \Sigma, \mathcal{P})$, where $\Sigma$ is a $\sigma$-algebra of subsets of $\Omega$ and $\mathcal{P}$ is a phantom probability measure, is called a phantom probability space. Given an event $A \in \Sigma$, $\mathcal{P}(A)$ is said to be the phantom probability of $A$.

As mentioned earlier, in the context of phantom probability spaces, the phantom term should be understood as a signed bounded distortion, with respect to each event, dispersed non-uniformly over the probability space and having total sum 0. Accordingly, the phantom probability measure can be understood as a family of real probability measures $P_t : \Sigma \to [0, 1]$, each satisfying

(2.3)    $P_t(A) \in [\mathcal{P}_{re}(A),\ \mathcal{P}_{re}(A) + \mathcal{P}_{ph}(A)]$, for any $A \in \Sigma$,

(or $P_t(A) \in [\mathcal{P}_{re}(A) + \mathcal{P}_{ph}(A),\ \mathcal{P}_{re}(A)]$, when $\mathcal{P}_{ph}(A)$ is negative).

We say that a phantom probability measure $\mathcal{P}' : \Sigma \to \mathbb{PH}$ agrees with $\mathcal{P} : \Sigma \to \mathbb{PH}$ if it satisfies

$[\mathcal{P}'_{re}(A),\ \mathcal{P}'_{re}(A) + \mathcal{P}'_{ph}(A)] \subseteq [\mathcal{P}_{re}(A),\ \mathcal{P}_{re}(A) + \mathcal{P}_{ph}(A)]$, for any $A \in \Sigma$.

A real probability measure $P : \Sigma \to \mathbb{R}$ is said to agree with $\mathcal{P} : \Sigma \to \mathbb{PH}$ if it satisfies Equation (2.3).
This argument provides the basis for Axiom 2.2 (iv): the sum of a probability and a distortion cannot exceed the probability of the whole $\Omega$ and is never negative, since otherwise it would violate the standard laws of probability.

Remark 2.7. In light of the previous paragraph, the former notions obtain the following special meaning:

(1) A probability $\mathcal{P}(A) = p + \wp q$ and its conjugate $(p + q) - \wp q$ represent the same likelihoods (in the usual sense) lying between $p$ and $p + q$. This is one of the reasons for specifying a norm in which $|\mathcal{P}(A)| = |\overline{\mathcal{P}(A)}|$.

When one wants to dismiss this similarity and to have a unique canonical representative, one can use the correspondence $q \rightsquigarrow [0, q)$, and therefore $p + \wp q$ is understood as $\{P \in [p, p + q)\}$. (The same setting can be used for sample spaces as well.)

(2) Zero divisors in the image of $\mathcal{P}$, if they exist, correspond to events whose likelihood might be equal to 0,
i.e. V(A) =0 + pq orV(A) ~p — pp, withp,q > 0; cf. Proposition ] 1 .J\ 

An exclusive case is when nothing is known about the likelihood of an event $A$; this scenario is recorded by $\mathcal{P}(A) = 0 + \wp 1$.

(3) Fixing $\mathcal{P}_{ph} := 0$ for the phantom component of $\mathcal{P}$, one gets the standard probability model.

(4) Cases in which two probabilities $\mathcal{P}(A)$ and $\mathcal{P}(B)$ are both phantom numbers but their sum is real might happen, and mean that the probability of $A \cup B$ is fixed, but the probability of the interior subdivision is uncertain.

Remark 2.8. Any phantom probability measure $\mathcal{P} : \Sigma \to \mathbb{PH}$ is associated naturally with a (real) reduced probability measure

$\hat P : \Sigma \to [0, 1]$,

given by sending each $A \in \Sigma$ to $\widehat{\mathcal{P}(A)} = \mathcal{P}_{re}(A) + \mathcal{P}_{ph}(A)$. Since $\hat P(A^c) = 1 - \hat P(A)$ for each $A \in \Sigma$ and $\hat P(\Omega) = 1$, it is easy to verify that $\hat P$ is a proper standard real probability measure.

We recall that the real component, $\mathcal{P}_{re} : \Sigma \to [0, 1]$, of the phantom measure $\mathcal{P}$ is a proper standard real probability measure as well.

As will be seen in the sequel, this reduced probability measure plays a crucial role in our future development.

2.2. Digression. In view of Subsection 1.2, the phantom probability measure $\mathcal{P} : \Sigma \to \mathbb{PH}$ is a special case of a phantom measure with probability zone $\bar\Delta \subset \mathbb{PH}_{(1)}(\mathbb{R})$ of order 1. The general case is given by the phantom probability measure

$\mathcal{P} : \Sigma \to \mathbb{PH}_{(n)}(\mathbb{R})$,
of order $n$, and the following generalization of Axiom 2.2 (iv):

(iv) $-\mathcal{P}_{re}(A) \le \sum_{\ell=1}^{i} \mathcal{P}_\ell(A) \le 1 - \mathcal{P}_{re}(A)$ for each $A \in \Sigma$ and $i = 1, \dots, n$,
where $\mathcal{P}_\ell$ denotes the phantom component of $\mathcal{P}$ of level $\ell$.

Generalizing our definitions appropriately, most of the following theory extends smoothly to phantom measures of order $n$.

2.3. Elementary properties of phantom probability. 

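Proposition 2.9 (Basic properties of phantom probability I). Given a phantom probability measure $\mathcal{P}$, the following properties are satisfied for each $A$ and $B$ in $\Sigma$:

(1) $\mathcal{P}_{ph}(\Omega) = 0$,

(2) $\mathcal{P}(\emptyset) = 0$, i.e. $\mathcal{P}_{ph}(\emptyset) = \mathcal{P}_{re}(\emptyset) = 0$,

(3) $-1 \le \mathcal{P}_{ph}(A) \le 1$ for each $A \in \Sigma$,

(4) $0 \le |\mathcal{P}(A)| \le 1$,

(5) $\mathcal{P}(A^c) = 1 - \mathcal{P}(A)$,

(6) $\mathcal{P}(A \cup B) = \mathcal{P}(A) + \mathcal{P}(B) - \mathcal{P}(A \cap B)$,

(7) $\mathcal{P}_{re}(A) \le \mathcal{P}_{re}(B)$ if $A \subset B \subset \Omega$,

(8) $\mathcal{P}_{re}(A \cup B) \le \mathcal{P}_{re}(A) + \mathcal{P}_{re}(B)$.

Proof.
(1) By definition, cf. Axiom 2.2 (ii).

(2) By Axiom 2.2 (iii), $\mathcal{P}(\Omega \cup \emptyset) = \mathcal{P}(\Omega) + \mathcal{P}(\emptyset)$. Thus, by Axiom 2.2 (i), $\mathcal{P}_{re}(\Omega) + \mathcal{P}_{re}(\emptyset) = 1 + \mathcal{P}_{re}(\emptyset) \le 1$, namely $\mathcal{P}_{re}(\emptyset) = 0$. On the other hand, by property (1), $\mathcal{P}_{ph}(\Omega) + \mathcal{P}_{ph}(\emptyset) = 0 + \mathcal{P}_{ph}(\emptyset) = 0$, so $\mathcal{P}_{ph}(\emptyset) = 0$.

(3) Immediate by Axiom 2.2 (i) and Axiom 2.2 (iv).

(4) Let $\mathcal{P}(A) = p + \wp q$, then $|p + \wp q|^2 = p^2 + pq + \tfrac{q^2}{2}$. Thus, since $q \le 1 - p$ and $0 \le p \le 1$,

$p^2 + pq + \tfrac{q^2}{2} \le p^2 + p(1 - p) + \tfrac{(1 - p)^2}{2} = \tfrac{1 + p^2}{2} \le 1$.

On the other hand, since $q \ge -p$, $p^2 + pq + \tfrac{q^2}{2} \ge p^2 + p(-p) + \tfrac{p^2}{2} = \tfrac{p^2}{2} \ge 0$.

(5) Straightforward from property (1).

(6) Write $A \cup B = A \cup (A^c \cap B)$ and $B = (A \cap B) \cup (A^c \cap B)$. The additivity axiom yields

$\mathcal{P}(A \cup B) = \mathcal{P}(A) + \mathcal{P}(A^c \cap B)$  and  $\mathcal{P}(B) = \mathcal{P}(A \cap B) + \mathcal{P}(A^c \cap B)$.

Subtracting the second equality from the first and rearranging terms, we obtain the required identity.

(7) and (8) are precisely the well known relations for the real probability measure $\mathcal{P}_{re}$.

□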

One of the reasons for defining the absolute value as in Equation (1.18) is that $0 \in \bar\Delta$ is the unique element with $|z| = 0$ and 1 is the unique element with $|z| = 1$. The equiv-norm elements in $\bar\Delta$ are a restriction of ellipses centered around the origin to the first quadrant (see Figure 1).

The same reason also led to defining the relation $\le_\alpha$ as in Equation (1.11), since then for each $z \in \bar\Delta$ we have $0 \le [z]_\alpha \le 1$, where 0 and 1 are obtained uniquely, i.e. $[1]_\alpha = 1$ and $[0]_\alpha = 0$. Thus, their equivalence classes in $\bar\Delta$ are singletons, and they are all the singletons in $\bar\Delta/_\alpha$. All the equivalence classes are parallel line segments having slope $-\alpha$.

Next, we plug in the given weak order $\preceq_{wk}$ on $\mathbb{PH}$; note that this weak order is assumed to satisfy the condition of Remark 2.3.

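Proposition 2.10 (Elementary properties of phantom probability II). For any phantom probability measure $\mathcal{P}$, the following properties are satisfied for each $A$ and $B$ in $\Sigma$:

(1) $0 \preceq_{wk} \mathcal{P}(A) \preceq_{wk} 1$ for each $A \in \Sigma$,

(2) $\mathcal{P}(A) \preceq_{wk} \mathcal{P}(B)$ whenever $A \subset B \subset \Omega$,

(3) $\mathcal{P}(A) \preceq_{wk} \mathcal{P}(A \cup B)$, for any pair of disjoint events $A$ and $B$ in $\Sigma$,

(4) $\mathcal{P}(A \cup B) \preceq_{wk} \mathcal{P}(A) + \mathcal{P}(B)$.

Proof.

(1) $\mathcal{P}(A) \in \bar\Delta$, in which $0 \preceq_{wk} z \preceq_{wk} 1$ for each $z \in \bar\Delta$, cf. Remark 2.3.

(2) Write $B = A \cup C$ with $C$ disjoint from $A$. So, by Axiom 2.2 (iii), $\mathcal{P}(A \cup C) = \mathcal{P}(A) + \mathcal{P}(C)$. But $\mathcal{P}(C) \succeq_{wk} 0$ by property (1), and hence $\mathcal{P}(A) + \mathcal{P}(C) \succeq_{wk} \mathcal{P}(A)$.

(3) Immediate by property (2).

(4) Combine property (1) and Proposition 2.9 (6).

□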

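Proposition 2.11 (Compound phantom probability measure). Suppose $\mathcal{P}_i$, $i = 1, \dots, m$, are phantom probability measures and $z_i \in \bar\Delta$ are phantom numbers $z_i = a_i + \wp b_i$ such that $\sum_i z_i = 1$. Then $\mathcal{P} = \sum_i z_i \mathcal{P}_i$ is also a legitimate phantom probability measure.

Proof. We need to verify the axioms of being a phantom probability measure; cf. Axiom 2.2.

(i) For a fixed $A \in \Sigma$, let $\mathcal{P}_{min,re}(A) = \min_i\{\mathcal{P}_{i,re}(A)\}$ and let $\mathcal{P}_{max,re}(A) = \max_i\{\mathcal{P}_{i,re}(A)\}$. Then,

$0 \le \mathcal{P}_{min,re}(A) \sum_i a_i \le \sum_i a_i \mathcal{P}_{i,re}(A) \le \mathcal{P}_{max,re}(A) \sum_i a_i \le 1$.

The fact that each $z_i$ is in $\bar\Delta$ ensures that $0 \le a_i \le 1$, and thus $a_i \mathcal{P}_{i,re}(A) \in [0, 1]$.
(ii) $\mathcal{P}(\Omega) = \sum_i z_i \mathcal{P}_i(\Omega) = \sum_i z_i = 1$.

(iii) By the additivity of each $\mathcal{P}_i$, since $A$ and $B$ are assumed to be disjoint, write $\mathcal{P}(A \cup B) = \sum_i z_i \mathcal{P}_i(A \cup B) = \sum_i z_i\left(\mathcal{P}_i(A) + \mathcal{P}_i(B)\right) = \sum_i z_i \mathcal{P}_i(A) + \sum_i z_i \mathcal{P}_i(B) = \mathcal{P}(A) + \mathcal{P}(B)$.

(iv) To prove that $\mathcal{P}_{ph}(A) \le 1 - \mathcal{P}_{re}(A)$, we expand

$\mathcal{P}_{ph}(A) = \sum_i \left((a_i + b_i)\mathcal{P}_{i,ph}(A) + b_i \mathcal{P}_{i,re}(A)\right)$

$\le \sum_i \left((a_i + b_i)(1 - \mathcal{P}_{i,re}(A)) + b_i \mathcal{P}_{i,re}(A)\right)$

$= \sum_i (a_i + b_i) - \sum_i a_i \mathcal{P}_{i,re}(A)$

$= 1 - \mathcal{P}_{re}(A)$.

The same argument shows that $-\mathcal{P}_{re}(A) \le \mathcal{P}_{ph}(A)$.

□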
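A compound measure is straightforward to compute on a finite sample space. In the sketch below the two component measures, the weights and the function name `compound` are invented for illustration; phantom numbers are pairs (real term, phantom term) and products follow $(a + \wp b)(c + \wp d) = ac + \wp(ad + bc + bd)$.

```python
def add(z, w):
    """Componentwise sum of two phantom numbers."""
    return (z[0] + w[0], z[1] + w[1])

def mul(z, w):
    """(a + p b)(c + p d) = ac + p(ad + bc + bd)."""
    a, b = z
    c, d = w
    return (a * c, a * d + b * c + b * d)

def compound(weights, measures):
    """Atom-by-atom value of sum_i z_i * P_i.

    weights is a list of phantom numbers summing to 1 + p*0 and measures
    is a list of dicts mapping each atom to its phantom probability.
    """
    out = {}
    for atom in measures[0]:
        acc = (0.0, 0.0)
        for z, m in zip(weights, measures):
            acc = add(acc, mul(z, m[atom]))
        out[atom] = acc
    return out

# Hypothetical data: two coin models mixed with phantom weights.
weights = [(0.5, 0.2), (0.5, -0.2)]
p1 = {"H": (0.4, 0.2), "T": (0.6, -0.2)}
p2 = {"H": (0.7, -0.1), "T": (0.3, 0.1)}
mixed = compound(weights, [p1, p2])

if __name__ == "__main__":
    print(mixed)
```

With these inputs a hand computation gives $0.55 + \wp 0.05$ for H and $0.45 - \wp 0.05$ for T, so the compound values again sum to 1 and stay inside the zone (2.1).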



Corollary 2.12. The space of phantom probability measures on a $\sigma$-algebra $\Sigma$ is closed under an action of probability measures. That is, given a family of phantom probability measures $\mathcal{P}_i : \Sigma \to \bar\Delta$, $i = 1, \dots, m$, and a phantom probability measure $\mathcal{Q} : \{\{1\}, \dots, \{m\}\} \to \bar\Delta$, then $\mathcal{P}$, defined as

$\mathcal{P} = \sum_i \mathcal{Q}(\{i\})\, \mathcal{P}_i$,

is also a phantom probability measure.

2.4. Initial examples. The following examples are presented mainly to demonstrate how nonstandard prob- 
lems are formulated naturally using phantom probability models. Later we show the phantom analogues to 
well-known probability distributions. 

We start with an example whose sample space is also a sample space in the usual sense. 

Example 2.13. Consider an unfair coin whose probability $P_t(H)$ to get head (in a single experiment) is unfixed, but belongs to the interval [0.4, 0.6]. Accordingly, for any possibility of $P_t(H)$, the probability $P_t(T)$ to get tail must satisfy $P_t(H) + P_t(T) = 1$, and thus is also restricted to the interval [0.4, 0.6].
This situation is formulated phantomly by letting

$\mathcal{P}(H) = 0.4 + \wp 0.2$  and  $\mathcal{P}(T) = 0.6 - \wp 0.2$.

In this view, the real term of $\mathcal{P}(H) + \mathcal{P}(T)$ is constantly 1, while the distortion, which is at most 0.2, is encoded in the phantom terms of $\mathcal{P}(H)$ and $\mathcal{P}(T)$.
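The coin of Example 2.13 can be written out explicitly. The sketch below assumes the reading in which $p + \wp q$ encodes the real likelihoods lying between $p$ and $p + q$; the names `interval` and `coin` are ours.

```python
def interval(z):
    """Real interval of likelihoods encoded by p + p*q: from p to p + q."""
    p, q = z
    return (min(p, p + q), max(p, p + q))

coin = {"H": (0.4, 0.2), "T": (0.6, -0.2)}

# Sum of the two phantom probabilities, term by term.
total = (sum(v[0] for v in coin.values()),
         sum(v[1] for v in coin.values()))

if __name__ == "__main__":
    for side, value in coin.items():
        print(side, value, interval(value))
    print("total:", total)
```

Both outcomes carry the interval from 0.4 to 0.6 (up to floating-point rounding), while the total is $1 + \wp 0$, so the uncertainty concerns only how the unit mass is split.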

In classical probability theory the uniform probability is defined by assigning an identical probability to each event $A$ in $\Sigma$, which, recall, is formulated as $P(A_i) = \tfrac{1}{k}$ for a discrete model with $\Sigma = \{A_1, \dots, A_k\}$. This trivial formulation becomes meaningless in the phantom framework, since by Axiom 2.2 (iv) the phantom term must be identically 0 for each $\mathcal{P}(A_i)$. But, in view of Remark 2.7 (1), one can alternate between phantom probabilities and their conjugates, provided $k$ is even, to have the sum of phantom terms equal 0.

Next we consider an example with a phantom probability space. 

Example 2.14. Assume a financial investment with an expected profit of 5M$ up to 10M$ in the case of success, which is estimated to have 40%-60% likelihood, and 0 profit otherwise. Using a phantom probability space we formulate this investment with $\Sigma = \{\{5M + \wp 5M\}, \{0\}\}$ where the phantom probability measure $\mathcal{P} : \Sigma \to \mathbb{PH}$ is given by

$\mathcal{P} : \{5M + \wp 5M\} \mapsto 0.4 + \wp 0.2$,    $\mathcal{P} : \{0\} \mapsto 0.6 - \wp 0.2$.

An exclusive case, very difficult to formulate using classical probability theory, is the following:

Example 2.15. Assume a gambler who knows nothing about the chances of winning in a new roulette game. Denoting the events of winning and losing respectively by $W$ and $L$, we define the phantom probability measure:

$\mathcal{P}(W) = 0 + \wp 1$  and  $\mathcal{P}(L) = 1 - \wp 1$.

Recall that these two phantom numbers are zero divisors in $\mathbb{PH}$.
2.5. Conditional phantom probability and Bayes' rule. Analogously to classical theory, the conditional phantom probability of $A$, given a fixed conditioning event $B$, is denoted $\mathcal{P}(A|B)$ and defined as

$\mathcal{P}(A|B) = \frac{\mathcal{P}(A \cap B)}{\mathcal{P}(B)}$,

where $\mathcal{P}(B)$ is assumed nonzero and a nonzero divisor. As a consequence, given two events $A$ and $B$, where $\mathcal{P}(B)$ is not a zero divisor, we have the equality:

(2.4)    $\mathcal{P}(A|B)\,\mathcal{P}(B) = \mathcal{P}(A \cap B)$.

It can be verified that for a fixed conditioning event, the conditional phantom probability forms a legitimate phantom probability law that satisfies Axiom 2.2. The fact that the phantom term of $\mathcal{P}(A|A)$ is 0 can be seen using Equation (1.5), that is

$\mathcal{P}(A|A) = \frac{p}{p} + \wp\, \frac{pq - qp}{p(p + q)} = 1 + \wp 0$,

for $\mathcal{P}(A) = p + \wp q$.

Assuming all the conditioning events have probabilities that are nonzero divisors and are $\ne 0$, applying Equation (2.4) recursively we have

$\mathcal{P}\left(\bigcap_{i=1}^{n} A_i\right) = \mathcal{P}(A_1)\, \mathcal{P}(A_2|A_1)\, \mathcal{P}(A_3|A_1 \cap A_2) \cdots \mathcal{P}\left(A_n \,\Big|\, \bigcap_{i=1}^{n-1} A_i\right)$.

A sequence of events $A_1, \dots, A_n \in \Sigma$ is said to be a partition of $\Omega$ if each possible outcome is included in one and only one of the events $A_1, \dots, A_n$. That is, the sample space $\Omega$ is the disjoint union of the events $A_1, \dots, A_n$.

Theorem 2.16 (Total probability theorem). Let $A_1, \dots, A_n$ be disjoint events that form a partition of the sample space $\Omega$ and assume that $\mathcal{P}(A_i) \ne 0$ is not a zero divisor, for all $i = 1, \dots, n$. Then, for any event $B$, we have

$\mathcal{P}(B) = \sum_i \mathcal{P}(B \cap A_i) = \sum_i \mathcal{P}(A_i)\, \mathcal{P}(B|A_i)$.

Proof. The events $A_1, \dots, A_n$ form a partition of the sample space $\Omega$, so the event $B$ can be decomposed into the disjoint union of its intersections $A_i \cap B$ with the sets $A_i$. Using the additivity axiom, Axiom 2.2 (iii), it follows that $\mathcal{P}(B) = \sum_i \mathcal{P}(B \cap A_i)$. The proof is completed by the definition of conditional probability, i.e. $\mathcal{P}(B \cap A_i) = \mathcal{P}(A_i)\,\mathcal{P}(B|A_i)$. □

Theorem 2.17 (Bayes' rule). Let $A_1, \dots, A_n$ be disjoint events that form a partition of the sample space $\Omega$ and assume that $\mathcal{P}(A_i) \ne 0$ is not a zero divisor, for all $i = 1, \dots, n$. Then, for any event $B$, we have

$\mathcal{P}(A_i|B) = \frac{\mathcal{P}(A_i)\, \mathcal{P}(B|A_i)}{\mathcal{P}(B)} = \frac{\mathcal{P}(A_i)\, \mathcal{P}(B|A_i)}{\mathcal{P}(A_1)\mathcal{P}(B|A_1) + \cdots + \mathcal{P}(A_n)\mathcal{P}(B|A_n)}$.

Proof. To verify Bayes' rule, note that $\mathcal{P}(A_i)\mathcal{P}(B|A_i)$ and $\mathcal{P}(A_i|B)\mathcal{P}(B)$ are equal, because they are both equal to $\mathcal{P}(A_i \cap B)$. This yields the first equality. The second equality follows from the first by using the total probability theorem to rewrite $\mathcal{P}(B)$. □
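A small numerical instance shows the total probability theorem and Bayes' rule working with phantom values. The priors and likelihoods below are invented for illustration, and the quotient rule $(a + \wp b)/(c + \wp d) = a/c + \wp\,(bc - ad)/(c(c + d))$ is assumed for denominators that are not zero divisors.

```python
def add(z, w):
    """Componentwise sum of two phantom numbers."""
    return (z[0] + w[0], z[1] + w[1])

def mul(z, w):
    """(a + p b)(c + p d) = ac + p(ad + bc + bd)."""
    a, b = z
    c, d = w
    return (a * c, a * d + b * c + b * d)

def div(z, w):
    """(a + p b)/(c + p d); requires c != 0 and c + d != 0."""
    a, b = z
    c, d = w
    return (a / c, (b * c - a * d) / (c * (c + d)))

def bayes(priors, likelihoods):
    """Return P(B) and the posteriors P(A_i | B) for a partition A_1..A_n."""
    joint = [mul(p, l) for p, l in zip(priors, likelihoods)]
    evidence = (0.0, 0.0)
    for j in joint:
        evidence = add(evidence, j)   # total probability theorem
    return evidence, [div(j, evidence) for j in joint]

# Hypothetical two-set partition with phantom priors and likelihoods.
priors = [(0.4, 0.2), (0.6, -0.2)]
likelihoods = [(0.5, 0.1), (0.2, 0.1)]
evidence, posteriors = bayes(priors, likelihoods)

if __name__ == "__main__":
    print(evidence, posteriors)
```

For these inputs a hand computation gives $\mathcal{P}(B) = 0.32 + \wp 0.16$ and posteriors $0.625 + \wp 0.125$ and $0.375 - \wp 0.125$, which add up to $1 + \wp 0$ as a conditional law should.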

Remark 2.18. In light of Theorems 2.16 and 2.17, the phantom probability space, as introduced in Definition 2.6, provides a Bayesian probability model. Being a Bayesian probability model is a crucial property of a theory of stochastic processes and Markov chains. The absence of this property is one of the main deficiencies in some alternative models that have been suggested in the past, cf. [1], [7], [18].

2.6. Independence. When the equality

$\mathcal{P}(A|B) = \mathcal{P}(A)$

holds, we say that the event $A$ is (phantomly) independent of the event $B$. Note that by the definition $\mathcal{P}(A|B) = \mathcal{P}(A \cap B)/\mathcal{P}(B)$, this is equivalent to

$\mathcal{P}(A \cap B) = \mathcal{P}(A)\,\mathcal{P}(B)$.

We adopt this latter relation as the definition of independence because it can be used even if $\mathcal{P}(B)$ is a zero divisor or 0, in which case $\mathcal{P}(A|B)$ is undefined.

The symmetry of this relation also implies that independence is a symmetric property; that is, if $A$ is independent of $B$, then $B$ is independent of $A$, and we can unambiguously say that $A$ and $B$ are independent events.
We noted earlier that the conditional phantom probabilities of events, conditioned on a particular event, form a legitimate probability law. Thus, we can talk about phantom independence of various events with respect to this conditional law. In particular, given an event $C$, the events $A$ and $B$ are called conditionally independent if

(2.5)    $\mathcal{P}(A \cap B|C) = \mathcal{P}(A|C)\,\mathcal{P}(B|C)$.

The definition of conditional probability and the multiplication rule yield

$\mathcal{P}(A \cap B|C) = \frac{\mathcal{P}(A \cap B \cap C)}{\mathcal{P}(C)} = \frac{\mathcal{P}(B \cap C)\,\mathcal{P}(A|B \cap C)}{\mathcal{P}(C)} = \frac{\mathcal{P}(C)\,\mathcal{P}(B|C)\,\mathcal{P}(A|B \cap C)}{\mathcal{P}(C)} = \mathcal{P}(B|C)\,\mathcal{P}(A|B \cap C)$,

and thus, using Equation (2.5),

$\mathcal{P}(B|C)\,\mathcal{P}(A|B \cap C) = \mathcal{P}(A|C)\,\mathcal{P}(B|C)$.

After canceling the factor $\mathcal{P}(B|C)$, assumed to be a nonzero divisor and $\ne 0$, we see that conditional independence is the same as the condition

$\mathcal{P}(A|B \cap C) = \mathcal{P}(A|C)$.

In other words, this relation states that if $C$ is known to have occurred, the additional knowledge that $B$ also occurred does not change the probability of $A$, even though it is a phantom probability (understood as a varied probability in the classical sense). Interestingly, as in the classical theory, independence of two events $A$ and $B$ with respect to the unconditional probability law does not imply conditional independence, and vice versa.

We generalize the definition of phantom independence to finitely many events, and say that the events $A_1, A_2, \dots, A_n$ are independent if

$\mathcal{P}\left(\bigcap_{i \in S} A_i\right) = \prod_{i \in S} \mathcal{P}(A_i)$,

for every subset $S$ of $\{1, 2, \dots, n\}$.



3. Phantom random variables 

In many probabilistic models, a random variable is a real- valued function A : Q — > IR of the outcomes of an 
experiment, which means that each outcome is assigned with a fixed single (real) numerical value. In real life 
this is far from being satisfactory; for example consider that these numerical values correspond to instrument 
readings or stock prices. Our module allows the assignment of a varied numerical value, recorded as a phantom 
number, to each outcome. 

Consider a random experiment with a sample space f2. A phantom random variable, written p. r. v. 
for short, 

X :fl — > PH 

is a single-phantom-valued function of the form 

(3.1) X :lu .— > x{t) =a x (t) + pb x (t), t e R, 

that assigns a phantom number x = X(ui), called the value of X, to each sample element w6 0. We write A re 
and A p h, respectively, for the real and the phantom components of X. In this realization, X is parameterized 
by real numbers, denoted by t; later we shall sec that this parametrization is either discrete or continuous. 

Note that the terminology used here is the traditional terminology of probability theory, and for this
reason we use the letter $x$, which stands for $a_x + \wp\, b_x$, to denote the phantom evaluation of $X$ at $\omega$, while
$z$ stands for an arbitrary element in $\mathbb{PH}$. (Clearly a p.r.v. is not a variable at all in the usual sense, but a
function.)

The sample space $\Omega$ is called the domain of the p.r.v. $X$ and is denoted $D_X$. The collection of all phantom
values $X(\omega)$, where $\omega \in \Omega$, is termed the phantom range, or just range for short, of the p.r.v. $X$ and
is denoted by $R_X$. Thus, the range $R_X = X(\Omega)$ is a certain subset of the set of all phantom numbers,
usually assumed to be without zero divisors.

Note that two or more different sample elements might give the same value of $X(\omega)$, but two different
numbers in the range cannot be assigned to the same sample point.

Remark 3.1. When the real parametrization in Map (3.1) is one-to-one, i.e. $x(t_1) \neq x(t_2)$ for any $t_1 \neq t_2$,
then the parametrization induces a total order on $R_X$. We denote this order as $<_t$.



Clearly, any function $g : R_X \to \mathbb{PH}$ of a p.r.v., i.e. a function whose domain contains the range of $X$,
defines another p.r.v.

To any p.r.v. $X$ we associate the reduced phantom random variable, written r.p.r.v. for short,

$$\hat{X} : \Omega \to \mathbb{R}$$

given by

$$\hat{X} : \omega \mapsto X_{re}(\omega) + X_{ph}(\omega), \tag{3.2}$$

which in view of Map (3.1) is $\hat{X}(\omega) = a_x(t) + b_x(t)$, with $t \in \mathbb{R}$.
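
To make the component notation concrete, the following is a minimal sketch (not part of the original development) of a discrete p.r.v. together with its reduced counterpart. It assumes the multiplication rule $\wp \cdot \wp = \wp$, which is the rule consistent with the expansions via Equation (1.13) used later in this section. The class name `Phantom`, the outcome labels, and the numerical values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phantom:
    """A phantom number re + p*ph, assuming the idempotent rule p*p = p."""
    re: float
    ph: float = 0.0

    def __add__(self, other):
        return Phantom(self.re + other.re, self.ph + other.ph)

    def __mul__(self, other):
        # (a + p b)(c + p d) = ac + p(ad + bc + bd), using p*p = p
        return Phantom(
            self.re * other.re,
            self.re * other.ph + self.ph * other.re + self.ph * other.ph,
        )

    def reduced(self):
        # Reduced value: real component plus phantom component, as in Map (3.2)
        return self.re + self.ph


# Hypothetical discrete p.r.v.: three outcomes with made-up readings
X = {"w1": Phantom(1.0, 0.2), "w2": Phantom(2.0, -0.1), "w3": Phantom(3.5, 0.0)}

# The associated reduced (real-valued) random variable
X_reduced = {w: x.reduced() for w, x in X.items()}
```

Under the stated assumption, reduction is compatible with products: `(X["w1"] * X["w2"]).reduced()` agrees with `X_reduced["w1"] * X_reduced["w2"]` up to rounding.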

Remark 3.2. In the development that follows we use the weak order $\precsim_{wk}$ on $\mathbb{PH}$, assumed to satisfy condition
(2.2). Accordingly, the range of a p.r.v. is well ordered. (Recall the main examples of a weak order, cf. Remark 2.3,
and the lexicographic order, cf. Equation (1.8), as an example of a total order.)

We also remark that the range $R_X$ of any given p.r.v. $X$ can be embedded in $\mathbb{R}$, cf. Map (3.1), and therefore,
as pointed out earlier, can be parameterized by the real numbers.

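If $X$ is a p.r.v. and $z \in \mathbb{PH}$ is a fixed phantom number, not necessarily in $R_X$, we define the event $(X = z)$
as the preimage of $z$, i.e.

$$(X = z) = \{\omega \in \Omega : X(\omega) = z\},$$

which has probability $\mathcal{P}(X = z)$. Note that when $z \notin R_X$, we set $\mathcal{P}(X = z) = 0$.
Similarly, for fixed numbers $z$, $z_1$, and $z_2$ in $\mathbb{PH}$, we define the following events:

$$\begin{aligned}
(X \sim_{wk} z) &= \{\omega \in \Omega : X(\omega) \sim_{wk} z\}, \\
(X \precsim_{wk} z) &= \{\omega \in \Omega : X(\omega) \precsim_{wk} z\}, \\
(X \succsim_{wk} z) &= \{\omega \in \Omega : X(\omega) \succsim_{wk} z\}, \\
(z_1 \prec_{wk} X \precsim_{wk} z_2) &= \{\omega \in \Omega : z_1 \prec_{wk} X(\omega) \precsim_{wk} z_2\},
\end{aligned}$$

which have respectively the phantom probabilities $\mathcal{P}(X \sim_{wk} z)$, $\mathcal{P}(X \precsim_{wk} z)$, $\mathcal{P}(X \succsim_{wk} z)$, and
$\mathcal{P}(z_1 \prec_{wk} X \precsim_{wk} z_2)$. (We emphasize that these values need not be real numbers.)

Note that when $z \notin R_X$, we can still have $\mathcal{P}(X \sim_{wk} z) \neq 0$. Of course this can only happen for a weak
order; for a total order $\mathcal{P}(X = z) = 0$ for each $z \notin R_X$.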

Given an arbitrary phantom number $z$ and a p.r.v. $X$ with a one-to-one real parametrization $t$, we define
the function $\overline{\xi}_X : \mathbb{PH} \to R_X$ by

$$\overline{\xi}_X(z) := \max\{\max\{x \in R_X : x \precsim_{wk} z\}\}, \tag{3.3}$$

and sometimes write $\overline{x}_z$ for $\overline{\xi}_X(z) \in R_X$. This function is well defined unless $R_X = \emptyset$. Note that the interior
max provides a set of elements of $R_X$ which are in the same equivalence class, determined by $\sim_{wk}$, while the
exterior max picks the maximal $t$-element in this class.

In the same way we define the function $\underline{\xi}_X : \mathbb{PH} \to R_X$ as

$$\underline{\xi}_X(z) := \min\{\min\{x \in R_X : x \succsim_{wk} z\}\}, \tag{3.4}$$

and write $\underline{x}_z$ for $\underline{\xi}_X(z) \in R_X$. As before, we set $\underline{\xi}_X(x) = x$ for each $x \in R_X$. We emphasize that the interior
"min" and "max" above are taken with respect to $\precsim_{wk}$, the weak order on $\mathbb{PH}$, and the exterior ones are
taken with respect to $<_t$, the total order on $R_X$. These functions are used mainly for continuous p.r.v.'s,
as will be seen later.

3.1. Discrete random variables. 

Definition 3.3. A p.r.v. $X$ is called discrete if its range $R_X$, i.e. the set of values that it can take, is finite
or at most countably infinite.

When $X$ is discrete, we sometimes denote the values of $X$ as $x_1, x_2, \dots, x_k, \dots$, indicating that it is
parameterized by $t \in \mathbb{N}$.

The most important way to characterize a random variable is through the (phantom) probabilities of the
values that it can take. For a discrete random variable $X$, these are captured by the probability mass
function of $X$, written p.m.f. for short, and denoted $p_X$. In particular, if $x$ is any possible value of $X$, the
probability mass of $x$, denoted $p_X(x)$, is the phantom probability of the event $\{X = x\}$ consisting of all
outcomes that give rise to a value of $X$ equal to $x$. That is,

$$p_X(x) = \mathcal{P}(X = x),$$

and therefore $0 \precsim_{wk} p_X(x) \precsim_{wk} 1$, cf. Proposition 2.10 (1). The mass function of $X$ is extended to the whole
of $\mathbb{PH}$ by setting $p_X(z) := 0$ for each $z \notin R_X$, and thus

$$p_X(z) = \begin{cases} \mathcal{P}(X = z), & z \in R_X; \\ 0, & z \notin R_X. \end{cases}$$

By the axioms of the phantom probability measure, we therefore have

$$\sum_{x \in X} p_X(x) = 1,$$

where in the summation above, $x$ ranges over all the possible numerical phantom values of $X$. This follows
from the additivity and normalization axioms, because the events $\{X = x\}$ are disjoint and form a partition of
the sample space $\Omega$, as $x$ ranges over all possible values of $X$. By a similar argument, for any set $S$ of phantom
numbers, we also have

$$\mathcal{P}(X \in S) = \sum_{x \in S} p_X(x),$$

where $X \in S$ refers to the values of $X$ which are contained in $S$. (This notation is a bit misleading, but it is the
traditional notation.)

Let us summarize the properties of phantom mass functions: 

Properties 3.4. Properties of a p.m.f. $p_X$:

(1) $0 \precsim_{wk} p_X(z) \precsim_{wk} 1$,

(2) $p_X(z) \in A$, and thus $0 \le |p_X(z)| \le 1$,

(3) $p_X(z) = 0$ if $z \neq x_1, x_2, \dots$, for all $x_i \in R_X$,

(4) $\sum_k p_X(x_k) = 1$.

In view of Remark 2.8, with any p.m.f. $p_X$ we associate the reduced probability mass function, written
r.p.m.f. for short,

$$\hat{p}_X(x) = \widehat{p_X(x)}.$$

Given a phantom number, in particular a phantom probability $p \in A$, by Definition 1.18 we always have

$$\widehat{1 - p} = 1 - \hat{p}. \tag{3.5}$$

Note that for each phantom number $p \in A$ we have the inclusions $\hat{p} \in [0, 1]$ and $1 - \hat{p} \in [0, 1]$ in the real interval
$[0, 1]$. We use this property in the forthcoming examples.

The following examples demonstrate how well-known probability mass functions are generalized naturally in
the phantom framework. Moreover, these examples show that most of these probabilities have the realization
property. We keep the traditional notation and denote the parameter as $\lambda$, though here it takes phantom
values.

Example 3.5 (The binomial p.r.v.). A biased coin with ambiguous probability is tossed $n$ times. At each
toss, the coin comes up a head with phantom probability $\mathbf{p} = p + \wp\, q$, and a tail with phantom probability
$\mathbf{q} = 1 - \mathbf{p}$, independently of prior tosses.

Let $X$ be the number of heads in the $n$-toss sequence. We refer to $X$ as a binomial p.r.v. with parameters
$n \in \mathbb{N}$ and $\mathbf{p} \in \mathbb{PH}$. The p.m.f. of $X$ consists of the binomial probabilities:

$$p_X(k) = \mathcal{P}(X = k) = \binom{n}{k} \mathbf{p}^k (1 - \mathbf{p})^{n-k}, \qquad k = 0, 1, \dots, n.$$

(Note that here and elsewhere, we simplify notation and use $k$, instead of $x_k$, to denote the discrete values of
integer-valued random variables.)

The normalization property $\sum_x p_X(x) = 1$, specialized to the binomial random variable, is written

$$\sum_{k=0}^{n} \binom{n}{k} \mathbf{p}^k (1 - \mathbf{p})^{n-k} = 1.$$

To see that this property is satisfied, apply Equation (1.13) to $\mathbf{p}^k (1 - \mathbf{p})^{n-k}$, that is

$$\mathbf{p}^k (1 - \mathbf{p})^{n-k} = p^k (1 - p)^{n-k} + \wp \left( \hat{p}^k (1 - \hat{p})^{n-k} - p^k (1 - p)^{n-k} \right),$$

where $\hat{p} = p + q$ is the reduced value of $\mathbf{p}$. Recalling that $\sum_{k=0}^{n} \binom{n}{k} p^k (1 - p)^{n-k} = 1$ for any real $p \in [0, 1]$, cf. [3], and that $\hat{p} \in [0, 1]$, we take the sum

$$\sum_{k=0}^{n} \binom{n}{k} p^k (1 - p)^{n-k} + \wp \sum_{k=0}^{n} \binom{n}{k} \left( \hat{p}^k (1 - \hat{p})^{n-k} - p^k (1 - p)^{n-k} \right) = 1 + \wp\, (1 - 1)$$

to obtain the desired result.
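
The normalization argument of Example 3.5 can be checked numerically. The sketch below is illustrative only: it stores a phantom number as a pair `(re, ph)`, assumes the product rule $\wp \cdot \wp = \wp$ that underlies the expansion above, and uses hypothetical helper names (`pmul`, `ppow`, `binomial_pmf`) and made-up parameter values.

```python
from math import comb


def pmul(x, y):
    """Multiply phantom numbers stored as (re, ph) pairs, assuming p*p = p."""
    (a, b), (c, d) = x, y
    return (a * c, a * d + b * c + b * d)


def ppow(x, n):
    """n-th power of a phantom number by repeated multiplication."""
    out = (1.0, 0.0)
    for _ in range(n):
        out = pmul(out, x)
    return out


def binomial_pmf(n, p):
    """Phantom binomial masses C(n,k) p^k (1-p)^(n-k) for k = 0..n."""
    q = (1.0 - p[0], -p[1])  # the phantom number 1 - p
    masses = []
    for k in range(n + 1):
        re, ph = pmul(ppow(p, k), ppow(q, n - k))
        masses.append((comb(n, k) * re, comb(n, k) * ph))
    return masses


p = (0.30, 0.05)  # illustrative: real part 0.30, phantom part 0.05
pmf = binomial_pmf(6, p)

# Sum of all masses, component by component
total = (sum(m[0] for m in pmf), sum(m[1] for m in pmf))
```

Up to rounding, `total` has real component 1 and phantom component 0, and the reduced value `re + ph` of each mass is the classical binomial mass at the reduced parameter 0.35.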

Example 3.6 (The geometric p.r.v.). The geometric p.r.v. is the number $X$ of trials, each with phantom
success probability $\mathbf{p} = p + \wp\, q$, needed for the first success. Its p.m.f. is given by

$$p_X(k) = (1 - \mathbf{p})^{k-1} \mathbf{p}, \qquad k = 1, 2, \dots.$$

This is a legitimate p.m.f. Indeed, use Equation (1.13) to write, with $\hat{p} = p + q$,

$$p_X(k) = \big((1 - p) - \wp\, q\big)^{k-1} (p + \wp\, q) = (1 - p)^{k-1} p + \wp \left( (1 - \hat{p})^{k-1} \hat{p} - (1 - p)^{k-1} p \right).$$

Recalling that $\sum_{k=1}^{\infty} (1 - p)^{k-1} p = 1$ for any real $p \in (0, 1]$, we take the sum

$$\sum_{k=1}^{\infty} p_X(k) = \sum_{k=1}^{\infty} (1 - p)^{k-1} p + \wp \sum_{k=1}^{\infty} \left( (1 - \hat{p})^{k-1} \hat{p} - (1 - p)^{k-1} p \right) = 1 + \wp\, (1 - 1)$$

to get the required result.

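Example 3.7 (The Poisson p.r.v.). A Poisson p.r.v. takes nonnegative integer values. Its p.m.f. is given
by

$$p_X(k) = e^{-\lambda} \frac{\lambda^k}{k!}, \qquad k = 0, 1, 2, \dots,$$

where $\lambda$ is a pseudo positive phantom parameter characterizing the p.m.f. It is a legitimate p.m.f. because

$$\sum_{k=0}^{\infty} e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \left( 1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \cdots \right) = e^{-\lambda} e^{\lambda} = 1.$$

The latter equality is by Proposition (3).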
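
As a numerical illustration of Example 3.7, the sketch below evaluates the phantom Poisson masses through the realization form $f(a + \wp b) = f(a) + \wp\,(f(a + b) - f(a))$, which is the pattern used for the binomial and geometric examples above. Applying it to the Poisson mass is an assumption of this sketch, as are the function name `poisson_pmf`, the parameter value, and the truncation of the series at 60 terms.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Phantom Poisson mass at k for lam = (re, ph), via two real evaluations."""
    a, b = lam
    real = exp(-a) * a ** k / factorial(k)                   # mass at the real part
    reduced = exp(-(a + b)) * (a + b) ** k / factorial(k)    # mass at the reduced parameter
    return (real, reduced - real)                            # re + p (reduced - re)


lam = (2.0, 0.5)  # illustrative pseudo positive parameter 2 + 0.5 p
masses = [poisson_pmf(k, lam) for k in range(60)]

# Truncated sum of the masses, component by component
total = (sum(m[0] for m in masses), sum(m[1] for m in masses))
```

The truncated sum has real component close to 1 and phantom component close to 0, in line with the normalization claimed in the example.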

3.2. Continuous random variables. The case when phantom random variables are continuous is much
more delicate than the discrete case, especially since paths are involved and their parametrization needs to be
included carefully in our formulation.

Definition 3.8. A p.r.v. $X$ is called continuous if its probability law can be described in terms of a piecewise
continuous phantom function $f_X : \mathbb{PH} \to \mathbb{PH}$, called the probability density function of $X$, written p.d.f.
for short, whose real component is nonnegative and which satisfies

$$\mathcal{P}(X \in S) = \int_S f_X(x)\, dx, \tag{3.6}$$

for every subset $S$ of

$$\gamma_X = \{ x = a_x(t) + \wp\, b_x(t) \mid t \in \mathbb{R} \}, \tag{3.7}$$

where $a_x(t)$ and $b_x(t)$ are real piecewise differentiable functions.

In fact, we care only about $f_X$ restricted to $\gamma_X$, which is the range of $X$, on which $f_X$ is piecewise continuous.
The set $\gamma_X$ is realized as a path interval in $\mathbb{PH} = \mathbb{R} \times \mathbb{R}$, isomorphic to an interval in $\mathbb{R}$, and it plays a main
role in our exposition.

Note that $S$ need not be continuous. In such a case, assuming $S$ is composed of countably many
continuous subsets $S_i$, the integral is translated to the sum of integrals over the $S_i$, i.e.

$$\int_S f_X(x)\, dx = \sum_{S_i} \int_{S_i} f_X(x)\, dx, \tag{3.8}$$

where each $S_i$ is continuous and the $S_i$'s are pairwise disjoint. In the sequel, for simplicity, we assume $S$ is a
continuous subset of $\gamma_X$; otherwise we apply the same consideration as in (3.8).




Figure 2. (a) Illustration of the compactification of a path used in most applications. (b)
The compactification of a path $\gamma_X$ having self-intersection points.



Remark 3.9. (i) Since $\gamma_X$ is a parameterized path in $\mathbb{PH}$, i.e. $x(t) = a_x(t) + \wp\, b_x(t)$ for each $x \in \gamma_X$,
the map $\mathbb{R} \to \gamma_X$ is not necessarily one-to-one, and several reals may have the same image. In other
words, $\gamma_X$ might have self-intersection points, see for example Figure 2 (b). These cases require special
treatment that is beyond the scope of this paper. Therefore, in the rest of this paper, paths
are always assumed to have no self-intersections.

(ii) Under this assumption, i.e. $\gamma_X$ has no self-intersection points, given a point $x = a_x(t) + \wp\, b_x(t)$ of
$\gamma_X$, for notational convenience we write $t(x)$ for the real $t$-value, determined by the parametrization of
$\gamma_X$, whose image is $x$. Moreover, as mentioned before (cf. Remark 3.1), the given parametrization
also determines an order on $\gamma_X$, which we denote as $<_t$, and write $x_1 <_t x_2$ when $t(x_1) < t(x_2)$.

(iii) In view of Remark 1.23 (ii), in most applications the image of a path $\gamma_X$ in the compactification of
$\mathbb{PH}$ has the endpoints $(0,0)$ and $(1,0)$, as illustrated by Figure 2 (a), and usually does not have points
with the same real term. However, we do not limit ourselves to this type of path.

(iv) Abusing the notation, as in the complex convention, in order to address the situation that a continuous
p.r.v. $X$ is provided as a path $\gamma_X$, parameterized by $t \in \mathbb{R}$, we sometimes write $\gamma_X(t)$ for $x(t)$ and $\gamma'_X(t)$
for the derivative $x'(t)$ of $x(t)$ with respect to $t$, which by Equation (1.21) is just $a'(t) + \wp\, b'(t)$.

Since the integration of p.d.f.'s is performed along paths whose order does not need to be compatible with
the weak order on $\mathbb{PH}$, we distinguish between cases in which the integration scope is determined by points
that belong to the path and cases in which these points are arbitrary phantom points. We start with the former
case, then we extend it to the latter case.

First, since we are dealing with an integral along a parameterized path, cf. Equation (3.6), using Equation
(1.23) this integral can be written as

$$\mathcal{P}(X \in S) = \int_S f_X(x)\, dx = \int_{t(S)} f_X(\gamma_X(t))\, \gamma'_X(t)\, dt,$$

for any $S \subseteq \gamma_X$, assumed to be continuous. In particular, the probability that the value $x$ of $X$ falls within a
path interval of $\gamma_X$, whose endpoints are $x_1$ and $x_2$, is

$$\mathcal{P}(x_1 \le_t X \le_t x_2) = \int_{t(x_1)}^{t(x_2)} f_X(\gamma_X(t))\, \gamma'_X(t)\, dt, \qquad \text{for } x_1, x_2 \text{ on } \gamma_X. \tag{3.9}$$

Recall that by Equation (1.24), this integral decomposes into two real integrals, one for the real component
and the second for the phantom component. The evaluation of each component can be interpreted as the area
confined between $\gamma_X$ and the graph of the corresponding function.
To simplify the notation, we define

$$f_{X,\gamma}(t) := f_X(\gamma_X(t))\, \gamma'_X(t) \tag{3.10}$$

and rewrite Equation (3.9) as

$$\mathcal{P}(x_1 \le_t X \le_t x_2) = \int_{t(x_1)}^{t(x_2)} f_{X,\gamma}(t)\, dt. \tag{3.11}$$

(In fact $\gamma_X$ is determined by $X$, but we use this notation to indicate that $f_{X,\gamma}(t)$ depends on the path $\gamma_X$ in
$\mathbb{PH}$.)
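
The parametric form of Equations (3.10) and (3.11) can be sketched numerically. The example below is an illustration under stated assumptions, not the paper's construction: the path $x(t) = t + \wp\,\tfrac{1}{2}t(1-t)$ on $t \in [0,1]$ (with endpoints $(0,0)$ and $(1,0)$), the density $f_X(x) = 2x$, the product rule $\wp \cdot \wp = \wp$, and the names `path`, `density`, `f_gamma`, and `prob` are all hypothetical. The integral is approximated by a midpoint rule applied separately to the real and phantom components.

```python
def pmul(x, y):
    """Multiply phantom numbers stored as (re, ph) pairs, assuming p*p = p."""
    (a, b), (c, d) = x, y
    return (a * c, a * d + b * c + b * d)


def path(t):
    """Hypothetical path x(t) = t + p * 0.5 t (1 - t), for t in [0, 1]."""
    return (t, 0.5 * t * (1.0 - t))


def path_prime(t):
    """Derivative x'(t) = a'(t) + p b'(t) of the path above."""
    return (1.0, 0.5 * (1.0 - 2.0 * t))


def density(x):
    """Hypothetical phantom density f_X(x) = 2x on the path."""
    return (2.0 * x[0], 2.0 * x[1])


def f_gamma(t):
    """f_{X,gamma}(t) = f_X(gamma_X(t)) * gamma_X'(t), as in Equation (3.10)."""
    return pmul(density(path(t)), path_prime(t))


def prob(t1, t2, steps=2000):
    """Midpoint-rule approximation of the integral in Equation (3.11)."""
    h = (t2 - t1) / steps
    re = ph = 0.0
    for i in range(steps):
        v = f_gamma(t1 + (i + 0.5) * h)
        re += v[0] * h
        ph += v[1] * h
    return (re, ph)
```

For this particular choice, `prob(0.0, 1.0)` is approximately `(1.0, 0.0)`, so the real component is normalized and the phantom component vanishes, while a sub-interval such as `prob(0.2, 0.7)` generally has a nonzero phantom component.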

As in the discrete case, a p.d.f. $f_{X,\gamma}$ is associated with the reduced probability density function,
written r.p.d.f. for short, defined as

$$\hat{\mathcal{P}}(x_1 \le_t X \le_t x_2) = \int_{t(x_1)}^{t(x_2)} \hat{f}_{X,\gamma}(t)\, dt, \tag{3.12}$$

by taking the integral over the sum of the real component and the phantom component of $f_{X,\gamma}$.
As in the standard theory, for any single value $x$ we have

$$\mathcal{P}(X = x) = \int_{t(x)}^{t(x)} f_{X,\gamma}(t)\, dt = 0.$$

Therefore, including or excluding the endpoints of an interval in $\gamma_X$ has no effect on its probability:

$$\mathcal{P}(x_1 \le_t X \le_t x_2) = \mathcal{P}(x_1 <_t X \le_t x_2) = \mathcal{P}(x_1 \le_t X <_t x_2) = \mathcal{P}(x_1 <_t X <_t x_2),$$

for any $x_1, x_2 \in \gamma_X$.

As usual, to qualify as a p.d.f., $f_X$ must satisfy the normalization property

$$\int_{\gamma_X} f_X\, dx = \int_{\mathbb{R}} f_{X,\gamma}(t)\, dt = 1,$$

where its real component must take only nonnegative values, i.e., $\mathrm{re}(f_X(x)) \ge 0$ for every $x \in \gamma_X$. Accordingly,
the function $f_{X,\gamma}$ satisfies

$$\int_{\mathbb{R}} \mathrm{re}(f_{X,\gamma}(t))\, dt = 1 \quad \text{and} \quad \int_{\mathbb{R}} \mathrm{ph}(f_{X,\gamma}(t))\, dt = 0;$$

that is, normalization and vanishing of phantoms, respectively.

Now, we turn to the cases in which the scopes of random variables are determined by arbitrary phantom
values. To extend $f_X$ to the whole of $\mathbb{PH}$, we fix $f_X(z) := 0$ for each $z \notin \gamma_X$ and define

$$\mathcal{P}(z_1 \precsim_{wk} X \precsim_{wk} z_2) = \int_S f_{X,\gamma}(t)\, dt, \qquad S = \{x \in \gamma_X : z_1 \precsim_{wk} x \precsim_{wk} z_2\}, \tag{3.13}$$

for any $z_1, z_2 \in \mathbb{PH}$. In the case that $S$ is not continuous, the integral is decomposed into the sum of countably
many integrals along the path intervals as in (3.8), taken with respect to $<_t$ in the positive direction.
When $S$ is continuous, Equation (3.13) has the form

$$\mathcal{P}(z_1 \precsim_{wk} X \precsim_{wk} z_2) = \int_{t(\underline{\xi}_X(z_1))}^{t(\overline{\xi}_X(z_2))} f_{X,\gamma}(t)\, dt, \tag{3.14}$$

which is just a line integral along a path interval of $\gamma_X$. Recall that $\overline{\xi}_X(z_2)$ and $\underline{\xi}_X(z_1)$, the functions of Equations (3.3) and
(3.4), respectively, provide the "top" point and the "bottom" point on $\gamma_X$ closest to $z_2$ and $z_1$, respectively; these points
are unique since $<_t$ is a total order applied on the weak order $\precsim_{wk}$. In this sense, we capture all the elements
in $R_X$ which are less than or $\sim_{wk}$-equivalent to $z_2$ and greater than or $\sim_{wk}$-equivalent to $z_1$.
Accordingly, we also have

$$\mathcal{P}(X \sim_{wk} z) = \int_{t(\underline{\xi}_X(z))}^{t(\overline{\xi}_X(z))} f_{X,\gamma}(t)\, dt, \tag{3.15}$$

which is the integration along all elements in $R_X$, assumed continuous, that are $\sim_{wk}$-equivalent to $z$. In this
view it is easy to see that we might have $\mathcal{P}(X \sim_{wk} z) \neq 0$.

Let us outline some basic properties of probability density functions: 

Properties 3.10. Properties of a p.d.f. $f_X$:

(1) $0 \precsim_{wk} f_X(z) \precsim_{wk} 1$, for each $z \in \mathbb{PH}$,

(2) $f_X(z) \in A$, and thus $0 \le |f_X(z)| \le 1$,

(3) $f_X(z) = 0$ if $z \notin \gamma_X$.

In most applications, the random variable is either discrete or continuous, but if the p.d.f. of a p.r.v. $X$
possesses features of both discrete and continuous random variables, then the random variable $X$ is called a
mixed phantom random variable.



3.3. Cumulative phantom distribution function. The cumulative phantom distribution function,
written c.p.d.f. for short, $F_X$ of a p.r.v. $X$ provides the probability $\mathcal{P}(X \precsim_{wk} z)$, i.e.

$$F_X(z) = \mathcal{P}(X \precsim_{wk} z) = \begin{cases} \sum_{x_k \precsim_{wk} z} p_X(x_k), & X \text{ discrete}; \\ \int_S f_{X,\gamma}(t)\, dt, & X \text{ continuous}, \end{cases} \tag{3.16}$$

for every $z$ in $\mathbb{PH}$. Here, $S$ is defined as $\{x \in \gamma_X : x \precsim_{wk} z\}$, cf. Equation (3.13) with $z_1$ tending to $-\infty$,
where $-\infty$ stands for $-\infty(1 + \wp)$.

Loosely speaking, the c.p.d.f. F x (z) "accumulates" phantom probability up to the phantom value z. As 
in classical theory, most of the information about a random experiment described by the p. r. v. X is recorded 
by the behavior of F x (z). 

Using Equation (3.13), if $X$ is a continuous random variable, then

$$\mathcal{P}(z_1 \precsim_{wk} X \precsim_{wk} z_2) = \int_S f_{X,\gamma}(t)\, dt = F_X(z_2) - F_X(z_1).$$

(The discrete analogue is obvious.) To emphasize, although $F_X$ takes an argument that is a phantom number,
it also depends on the parametrization of $X$.

Properties 3.11. Writing $z \to \infty$ for $z = a + \wp\, b$ with $a \to \infty$ and $b \to \infty$, we have the following properties:

(1) $0 \precsim_{wk} F_X(z) \precsim_{wk} 1$,

(2) $0 \le |F_X(z)| \le 1$,

(3) $F_X(z_1) \precsim_{wk} F_X(z_2)$ if $z_1 \precsim_{wk} z_2$,

(4) $\lim_{z \to \infty} F_X(z) = F_X(\infty) = 1$,

(5) $\lim_{z \to -\infty} F_X(z) = F_X(-\infty) = 0$,

(6) $\lim_{z \to z_0^+} F_X(z) = F_X(z_0^+)$, where $z_0^+ = \lim_{0 < |\epsilon| \to 0} (z_0 + \epsilon)$.

(The verification of these properties is straightforward.)

Accordingly, by Properties 3.11, one can also compute other probabilities, such as

$$\begin{aligned}
\mathcal{P}(X \succ_{wk} z_1) &= 1 - F_X(z_1), \\
\mathcal{P}(X \prec_{wk} z_2) &= F_X(z_2^-), \quad \text{where } z_2^- = \lim_{0 < |\epsilon| \to 0} (z_2 - \epsilon).
\end{aligned} \tag{3.17}$$

3.4. Moments and variance. In the sequel, we use the notation $E[\,\cdot\,]$, $\mathrm{Var}[\,\cdot\,]$, and $\mathrm{Cov}[\,\cdot\,]$ respectively for
moments, variance, and covariance. These notations are used for both the phantom sense and the standard
sense (applied to real numbers) of the respective probability functions, where the meaning is understood from
the context. We also point out that for these functions, and others, the standard form is always captured
in the real component of the function. As will be seen, this attribute is provided for free by the phantom
structure, and one should keep in mind that the generalization to the phantom framework is performed only
through the phantom terms of the arguments. This is the leading idea of our forthcoming exposition.

Definition 3.12. The $n$'th moment of a p.r.v. $X$ is defined by

$$E[X^n] = \begin{cases} \sum_x x^n\, p_X(x), & X \text{ discrete}; \\ \int_{\gamma} (\gamma_X(t))^n f_{X,\gamma}(t)\, dt, & X \text{ continuous}. \end{cases} \tag{3.18}$$

The first moment $E[X^1]$ is called the mean, or the expected value, of $X$ and is denoted by $\mu_X$.

The $n$'th reduced moment $E[(\hat{X})^n]$ of a p.r.v. $X$ is the standard moment for reals, applied to $\hat{X}$ with
$\hat{p}_X$, or $\hat{f}_{X,\gamma}$, for a discrete or a continuous $X$, respectively. In the same way we define the $n$'th conjugate
moment $E[(\bar{X})^n]$ of $X$, by taking $\bar{X}$, computed with respect to $\bar{p}_X$, or $\bar{f}_{X,\gamma}$; clearly this is a phantom function.

We write $E_{re}[X^n]$ and $E_{ph}[X^n]$, respectively, for the real term and the phantom term of $E[X^n]$ and therefore
have

$$E_{re}[X^n] = E[X_{re}^n],$$

where $E[X_{re}^n]$ is taken with respect to the real component of $p_X$ or $f_{X,\gamma}$. (This relation is not satisfied for the
phantom component, i.e. $E_{ph}[X^n] \neq E[X_{ph}^n]$, since it also involves the real term of the arguments.)
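When $X$ is discrete, using Equation (1.13), we write

$$\begin{aligned}
E[X^n] &= \sum_x (a_x + \wp\, b_x)^n (p_x + \wp\, q_x) \\
&= \sum_x \Big( a_x^n p_x + \wp \big( (a_x + b_x)^n (p_x + q_x) - a_x^n p_x \big) \Big) \\
&= \sum_x \Big( a_x^n p_x + \wp \big( \hat{x}^n \hat{p}_x - a_x^n p_x \big) \Big),
\end{aligned}$$

for $x = a_x + \wp\, b_x$ and $p_X(x) = \mathbf{p}_x = p_x + \wp\, q_x$; a similar form is also obtained for a continuous $X$. Accordingly,
the phantom moment satisfies the realization property, and Equation (3.18) gets the following friendly form:

$$E[X^n] = E[X_{re}^n] + \wp \left( E[\hat{X}^n] - E[X_{re}^n] \right), \tag{3.19}$$

where $E[X_{re}^n]$ stands for $\sum_x p_x a_x^n$ and $E[\hat{X}^n]$ stands for $\sum_x \hat{p}_x \hat{x}^n$.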
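
The realization form (3.19) can be compared against a direct evaluation of the discrete moment in phantom arithmetic. The sketch below assumes the product rule $\wp \cdot \wp = \wp$, stores phantom numbers as pairs `(re, ph)`, and uses hypothetical values and masses chosen so that the real masses sum to 1 and the phantom masses sum to 0.

```python
def pmul(x, y):
    """Product of phantom numbers stored as (re, ph), assuming p*p = p."""
    (a, b), (c, d) = x, y
    return (a * c, a * d + b * c + b * d)


def moment(values, masses, n):
    """E[X^n] computed directly in phantom arithmetic (discrete case)."""
    re = ph = 0.0
    for x, p in zip(values, masses):
        term = p
        for _ in range(n):
            term = pmul(term, x)  # multiply the mass by x, n times
        re, ph = re + term[0], ph + term[1]
    return (re, ph)


def moment_by_realization(values, masses, n):
    """E[X^n] assembled from two real moments, following Equation (3.19)."""
    real = sum(p[0] * x[0] ** n for x, p in zip(values, masses))
    reduced = sum((p[0] + p[1]) * (x[0] + x[1]) ** n for x, p in zip(values, masses))
    return (real, reduced - real)


# Hypothetical phantom values x = a + p b and masses p_x + p q_x
values = [(1.0, 0.5), (2.0, -0.25), (4.0, 0.0)]
masses = [(0.2, 0.1), (0.5, -0.2), (0.3, 0.1)]
```

For each `n`, `moment(values, masses, n)` and `moment_by_realization(values, masses, n)` agree up to rounding, which is the content of Equation (3.19) for this data.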



Proposition 3.13. $\overline{E[X^n]} = E[\bar{X}^n]$, where $E[\bar{X}^n]$ is computed with respect to the conjugates of $X$ and the
probability measure.

Proof. Straightforward by the additivity and the multiplicativity of the phantom conjugate. □

The variance of a p.r.v. $X$, denoted by $\sigma_X^2$ or $\mathrm{Var}[X]$, is defined as

$$\sigma_X^2 = E[(X - \mu_X)^2] \tag{3.20}$$

and thus

$$\sigma_X^2 = \begin{cases} \sum_x (x - \mu_X)^2\, p_X(x), & X \text{ is discrete}; \\ \int_{\gamma} (\gamma_X(t) - \mu_X)^2 f_{X,\gamma}(t)\, dt, & X \text{ is continuous}. \end{cases} \tag{3.21}$$

As usual, $\mathrm{Var}_{re}[X]$ and $\mathrm{Var}_{ph}[X]$ denote respectively the real term and the phantom term of $\mathrm{Var}[X]$, where
$\mathrm{Var}_{re}[X] = \mathrm{Var}[X_{re}]$ is taken with respect to the real component of the probability measure. Thus, we always
have $\mathrm{Var}_{re}[X] \ge 0$, since it is just a standard (real) variance. Moreover, we have the following property:

Proposition 3.14. The variance $\mathrm{Var}[X]$ is pseudo nonnegative, for any p.r.v. $X$.

Proof. In view of Equation (3.21), since squares of phantom numbers and probabilities are pseudo
nonnegative, the proof is completed by Lemma 1.10 applied to the sum of the products. □

The reduced variance, i.e. a real-valued function, is defined as

$$\mathrm{Var}[\hat{X}] = E[(\hat{X} - E[\hat{X}])^2].$$

Since this is a standard (real) variance, we always have $\mathrm{Var}[\hat{X}] \ge 0$. The conjugate variance is defined
similarly as $\mathrm{Var}[\bar{X}] = E[(\bar{X} - E[\bar{X}])^2]$, i.e. taken with respect to the conjugates of $X$ and the probability
measure.

Expanding the right-hand side of Equation (3.20), we obtain the following relation:

$$\mathrm{Var}[X] = E[X^2] - E[X]^2, \tag{3.22}$$

which is a useful formula for determining the variance. Plugging Equation (3.19) into this form, and
simplifying, one obtains the realization property for variance:

$$\mathrm{Var}[X] = \mathrm{Var}[X_{re}] + \wp \left( \mathrm{Var}[\hat{X}] - \mathrm{Var}[X_{re}] \right). \tag{3.23}$$

We recall that the notation $\mathrm{Var}[\,\cdot\,]$ is used for both the phantom and the standard variance, where the meaning
is understood from the context.

Remark 3.15. Both $E[X^n]$ and $\mathrm{Var}[X]$ are phantom functions whose real components satisfy the familiar
properties of the $n$'th moment and variance.

Assuming $g(X)$ is a phantom function of a p.r.v. $X$, the expected value of the p.r.v. $g(X)$ is given by

$$E[g(X)] = \begin{cases} \sum_x g(x)\, p_X(x), & X \text{ discrete}; \\ \int_{\gamma} g(\gamma_X(t))\, f_{X,\gamma}(t)\, dt, & X \text{ continuous}. \end{cases}$$

It is straightforward to verify that when $g$ is a linear phantom function, say $g(X) = \alpha X + \beta$, with $\alpha, \beta \in \mathbb{PH}$,
then

$$E[g(X)] = \alpha E[X] + \beta \quad \text{and} \quad \mathrm{Var}[g(X)] = \alpha^2\, \mathrm{Var}[X], \tag{3.24}$$

where the latter formula is obtained from (3.21).



Proposition 3.16. $\overline{\mathrm{Var}[X]} = \mathrm{Var}[\bar{X}]$, where $\mathrm{Var}[\bar{X}]$ is computed with respect to the conjugates of $X$ and the
probability measure.

Proof. Use Equation (3.22) and the additivity of the phantom conjugate to write $\overline{\mathrm{Var}[X]} = \overline{E[X^2]} - \overline{E[X]^2}$, which by Proposition 3.13 is
$E[\bar{X}^2] - \overline{E[X]^2}$ and, again by the multiplicativity of the conjugate and Proposition 3.13, is $E[\bar{X}^2] - E[\bar{X}]^2$, that is $\mathrm{Var}[\bar{X}]$. □

By Proposition 3.14, we obtain the following additional phantom analog:

Definition 3.17. The standard phantom deviation $\sigma_X$ of a p.r.v. $X$ is defined to be the maximal
nonnegative phantom square root of $\mathrm{Var}[X]$, cf. Equation (1.16), i.e. the root with the nonnegative real term and
the maximal nonnegative phantom term.

Since each phantom number has a nonnegative square root, cf. Equation (1.16), and $\mathrm{Var}[X]$ is pseudo
nonnegative, the standard phantom deviation is well defined for any p.r.v. Using Equation (1.16) it is
easy to see that the standard phantom deviation also admits the realization property, i.e.

$$\sigma_X = \sigma_{X_{re}} + \wp \left( \sigma_{\hat{X}} - \sigma_{X_{re}} \right).$$
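
The variance relation (3.22) and the realization form of the standard phantom deviation can be illustrated together. The sketch below is a hedged example: it assumes the product rule $\wp \cdot \wp = \wp$, takes the square root component-wise on the real and reduced values (which matches the realization formula above when both are nonnegative), and uses hypothetical data and helper names.

```python
from math import sqrt


def pmul(x, y):
    """Product of phantom numbers stored as (re, ph), assuming p*p = p."""
    (a, b), (c, d) = x, y
    return (a * c, a * d + b * c + b * d)


def psub(x, y):
    """Difference of two phantom numbers."""
    return (x[0] - y[0], x[1] - y[1])


def expectation(values, masses, g=lambda x: x):
    """E[g(X)] for a discrete p.r.v., summed in phantom arithmetic."""
    re = ph = 0.0
    for x, p in zip(values, masses):
        term = pmul(g(x), p)
        re, ph = re + term[0], ph + term[1]
    return (re, ph)


def phantom_sqrt(z):
    """Root with nonnegative real and reduced parts; needs a >= 0 and a + b >= 0."""
    a, b = z
    return (sqrt(a), sqrt(a + b) - sqrt(a))


# Hypothetical phantom values and masses (real masses sum to 1, phantom masses to 0)
values = [(1.0, 0.5), (2.0, -0.25), (4.0, 0.0)]
masses = [(0.2, 0.1), (0.5, -0.2), (0.3, 0.1)]

mean = expectation(values, masses)
second = expectation(values, masses, lambda x: pmul(x, x))
var = psub(second, pmul(mean, mean))  # Var[X] = E[X^2] - E[X]^2, Equation (3.22)
sigma = phantom_sqrt(var)             # realization form of the deviation
```

Here the real component of `var` is the classical variance of the real parts, and `pmul(sigma, sigma)` recovers `var` up to rounding.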

3.5. Special examples. The following examples show how the classical mean and variance carry over naturally
to the phantom framework. When a p.r.v. is discrete, we
can retain the exact classical setting, while the continuous cases require a modification of definitions which
involves the parametrization of $X$, i.e. that of $\gamma_X$. Yet, the standard (real) distributions are obtained as
special cases of the phantom ones.

Example 3.18 (Mean and variance of the Bernoulli). Consider the experiment of tossing a biased coin,
which comes up a head with phantom probability $p$ and a tail with probability $1 - p$, and the Bernoulli p.r.v.
$X$ with p.m.f.

$$p_X(k) = \begin{cases} p, & k = 1; \\ 1 - p, & k = 0. \end{cases}$$

Then $E[X] = 1 \cdot p + 0 \cdot (1 - p) = p$, $E[X^2] = 1^2 \cdot p + 0^2 \cdot (1 - p) = p$, and thus $\mathrm{Var}[X] = E[X^2] - E[X]^2 = p - p^2$.

Example 3.19 (The mean of the Poisson). The mean of the Poisson p.m.f. with pseudo positive parameter
$\lambda \in \mathbb{PH}$,

$$p_X(k) = e^{-\lambda} \frac{\lambda^k}{k!}, \qquad k = 0, 1, 2, \dots,$$

can be calculated as follows:

$$E[X] = \sum_{k=0}^{\infty} k\, e^{-\lambda} \frac{\lambda^k}{k!} \overset{(*)}{=} \sum_{k=1}^{\infty} e^{-\lambda} \frac{\lambda^k}{(k-1)!} = \lambda \sum_{k=1}^{\infty} e^{-\lambda} \frac{\lambda^{k-1}}{(k-1)!} = \lambda \sum_{m=0}^{\infty} e^{-\lambda} \frac{\lambda^m}{m!} = \lambda.$$

((*) the component indexed $k = 0$ is zero.) A similar calculation shows that the phantom variance of a Poisson
random variable is also $\lambda$.

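Example 3.20 (The phantom exponential p.r.v.). Let $x(t) = a(t) + \wp\, b(t)$. We write $a := a(t)$, $b := b(t)$,
and $x := x(t)$, for short. The notation $x'$ stands for the derivative of $x$ with respect to $t$, and thus $x' = a' + \wp\, b'$.
A phantom exponential p.r.v. has a p.d.f. of the form

$$f_X(x) = \begin{cases} \dfrac{\lambda}{x'}\, e^{-\lambda x}, & x \text{ is pseudo positive}; \\ 0, & \text{otherwise}, \end{cases}$$

where $\lambda \in \mathbb{PH}$ is a pseudo positive parameter. In particular $\lambda \neq 0$ and is not a zero divisor in $\mathbb{PH}$.

One observes that when $X$ is a real random variable, i.e. $x(t) := t$, then $x' = 1$ and the phantom exponential
$f_X$ collapses to the known exponential random variable.

Using Equation (1.2) and Equation (1.21), respectively, we have

$$x\lambda = a\lambda_{re} + \wp \left( \hat{x}\hat{\lambda} - a\lambda_{re} \right), \quad \text{and} \quad \frac{\lambda}{x'} = \frac{\lambda_{re}}{a'} + \wp \left( \frac{\hat{\lambda}}{\hat{x}'} - \frac{\lambda_{re}}{a'} \right),$$

where $\hat{x} = a + b$ and $\hat{\lambda} = \lambda_{re} + \lambda_{ph}$ denote the reduced values. Then, Equation (1.17) yields

$$e^{-x\lambda} = e^{-a\lambda_{re}} + \wp \left( e^{-\hat{x}\hat{\lambda}} - e^{-a\lambda_{re}} \right),$$

and therefore

$$\frac{\lambda}{x'}\, e^{-\lambda x} = \frac{\lambda_{re}}{a'}\, e^{-a\lambda_{re}} + \wp \left( \frac{\hat{\lambda}}{\hat{x}'}\, e^{-\hat{x}\hat{\lambda}} - \frac{\lambda_{re}}{a'}\, e^{-a\lambda_{re}} \right).$$

Now, for $f_{X,\gamma} = f_X\, \gamma'_X$ we get

$$f_{X,\gamma} = \frac{\lambda}{x'}\, e^{-\lambda x}\, x' = \lambda_{re}\, e^{-a\lambda_{re}} + \wp \left( \hat{\lambda}\, e^{-\hat{x}\hat{\lambda}} - \lambda_{re}\, e^{-a\lambda_{re}} \right),$$

which shows that $f_{X,\gamma}$ satisfies the normalization property. This is because each component is by itself a real
exponential, and thus the real component integrates to 1 and the phantom component integrates to 0.
A similar computation shows that

$$x f_{X,\gamma} = a\lambda_{re}\, e^{-a\lambda_{re}} + \wp \left( \hat{x}\hat{\lambda}\, e^{-\hat{x}\hat{\lambda}} - a\lambda_{re}\, e^{-a\lambda_{re}} \right).$$

Recalling that for a real exponential random variable $E[X] = 1/\lambda$ and $\mathrm{Var}[X] = 1/\lambda^2$, cf. [3], and taking the
integral, we get the mean of the exponential p.r.v. in terms of the phantom parameter $\lambda$ as

$$E[X] = \int x(t)\, f_{X,\gamma}(t)\, dt = \frac{1}{\lambda_{re}} + \wp \left( \frac{1}{\hat{\lambda}} - \frac{1}{\lambda_{re}} \right) = \frac{1}{\lambda},$$

cf. Equation (1.4).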
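
The closing identity, that the realization form $\frac{1}{\lambda_{re}} + \wp\big(\frac{1}{\hat\lambda} - \frac{1}{\lambda_{re}}\big)$ is the phantom inverse of $\lambda$, can be checked with a short sketch. It assumes the product rule $\wp \cdot \wp = \wp$ and that $\lambda$ is neither zero nor a zero divisor (here taken to mean that both the real part and the reduced value are nonzero); the names `pmul` and `pinv` and the parameter value are hypothetical.

```python
def pmul(x, y):
    """Product of phantom numbers stored as (re, ph), assuming p*p = p."""
    (a, b), (c, d) = x, y
    return (a * c, a * d + b * c + b * d)


def pinv(z):
    """Phantom inverse in realization form: 1/a + p (1/(a+b) - 1/a)."""
    a, b = z
    if a == 0 or a + b == 0:
        raise ZeroDivisionError("zero or zero divisor has no phantom inverse")
    return (1.0 / a, 1.0 / (a + b) - 1.0 / a)


lam = (2.0, 0.5)         # illustrative pseudo positive parameter 2 + 0.5 p
mean = pinv(lam)         # candidate mean of the phantom exponential
check = pmul(lam, mean)  # should be the phantom unit 1 + 0 p
```

For this value, `mean` is approximately `(0.5, -0.1)`: the real component is the classical mean $1/\lambda_{re}$, and the phantom component records the shift to $1/\hat{\lambda}$.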

In fact we could also have obtained this relation in a shorter way by using Equation (3.19), but, for the
sake of validation, we have presented the detailed computation.

3.6. Normal random variables. A continuous p.r.v. $X$ is said to be phantom normal, or phantom
Gaussian, if it has a p.d.f. of the form

$$f_X(x) = \frac{1}{\sigma\, x' \sqrt{2\pi}}\, e^{-(x - \mu)^2 / 2\sigma^2},$$

where $\mu$ and $\sigma$ are two phantom scalar parameters characterizing the p.d.f., with $\sigma$ assumed pseudo positive.
For simplicity, we also assume $x$ differentiable, and write $x'$ for the derivative of $x := x(t) = a(t) + \wp\, b(t)$ with
respect to $t$.

Note that in comparison to the classical case the normal p.r.v. includes the extra argument $x'$ in the
denominator. Yet, as we had for the exponential p.r.v., when $X$ is assumed to take only real values, the
phantom normal density function collapses to the classical normal density function.

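Proposition 3.21. $f_X(x)$ satisfies the normalization property

$$\frac{1}{\sigma \sqrt{2\pi}} \int_{\gamma_X} \frac{1}{x'}\, e^{-(x - \mu)^2 / 2\sigma^2}\, dx = 1, \tag{3.25}$$

where $\gamma_X$ is parameterized by $t \in \mathbb{R}$, assumed differentiable.

Proof. Let $w = x - \mu$, and therefore $w' = x'$. Then $(x - \mu)^2 / \sigma^2 = w^2 / \sigma^2$, and using a simple computation one
can verify that

$$\frac{w^2}{\sigma^2} = \frac{w_{re}^2}{\sigma_{re}^2} + \wp \left( \frac{\hat{w}^2}{\hat{\sigma}^2} - \frac{w_{re}^2}{\sigma_{re}^2} \right),$$

where hats denote reduced values. Plugging this into $e^{-(x - \mu)^2 / 2\sigma^2}$ and using Equation (1.17), we have

$$e^{-(x - \mu)^2 / 2\sigma^2} = e^{-w^2 / 2\sigma^2} = e^{-w_{re}^2 / 2\sigma_{re}^2} + \wp \left( e^{-\hat{w}^2 / 2\hat{\sigma}^2} - e^{-w_{re}^2 / 2\sigma_{re}^2} \right).$$

Thus,

$$\frac{1}{x'}\, e^{-w^2 / 2\sigma^2} = \frac{1}{a'}\, e^{-w_{re}^2 / 2\sigma_{re}^2} + \wp \left( \frac{1}{\hat{x}'}\, e^{-\hat{w}^2 / 2\hat{\sigma}^2} - \frac{1}{a'}\, e^{-w_{re}^2 / 2\sigma_{re}^2} \right).$$

But then, $\frac{1}{x'}\, e^{-w^2 / 2\sigma^2}\, \gamma'_X$ is just $e^{-w_{re}^2 / 2\sigma_{re}^2} + \wp \left( e^{-\hat{w}^2 / 2\hat{\sigma}^2} - e^{-w_{re}^2 / 2\sigma_{re}^2} \right)$.

Recalling that $\frac{1}{\sigma} = \frac{1}{\sigma_{re} + \wp\, \sigma_{ph}} = \frac{1}{\sigma_{re}} + \wp \left( \frac{1}{\hat{\sigma}} - \frac{1}{\sigma_{re}} \right)$ and integrating this in (3.25), given in parametric form,
we get

$$\frac{1}{\sigma_{re} \sqrt{2\pi}} \int e^{-w_{re}^2 / 2\sigma_{re}^2}\, dt + \wp \left( \frac{1}{\hat{\sigma} \sqrt{2\pi}} \int e^{-\hat{w}^2 / 2\hat{\sigma}^2}\, dt - \frac{1}{\sigma_{re} \sqrt{2\pi}} \int e^{-w_{re}^2 / 2\sigma_{re}^2}\, dt \right) = 1 + \wp\, (1 - 1), \tag{3.26}$$

since the phantom component is the difference of two standard normal integrals, each equal to 1. □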

Equation (3.26) shows that the phantom normal p.d.f. admits the realization property.

Proposition 3.22. The mean and the variance of a normal p.r.v. $X$ with phantom parameters $\mu$ and $\sigma$ are

$$E[X] = \mu \quad \text{and} \quad \mathrm{Var}[X] = \sigma^2.$$

Proof. Consider the realization property of Equation (3.26) combined, respectively, with Equation (3.19) and
Equation (3.23). □

Theorem 3.23. Normality is preserved under linear transformations. If $X$ is a normal p.r.v. with mean $\mu$
and variance $\sigma^2$, and if $\alpha, \beta \in \mathbb{PH}$ are phantom scalars, then the p.r.v. $Y = \alpha X + \beta$ is also normal, with mean
and variance

$$E[Y] = \alpha\mu + \beta, \qquad \mathrm{Var}[Y] = \alpha^2 \sigma^2.$$

Proof. Immediate by Proposition 3.22 and Equation (3.24). □

A normal random variable $Y$ with zero mean and unit variance is said to be a standard phantom normal.
Its c.p.d.f., denoted as $\Phi$, is given by

$$\Phi(z) = \mathcal{P}(Y \precsim_{wk} z) = \frac{1}{\sqrt{2\pi}} \int_S \frac{1}{y'}\, e^{-y^2 / 2}\, dy, \tag{3.27}$$

where $S = \{y \in \gamma_Y : y \precsim_{wk} z\}$, assumed continuous and differentiable. Clearly, this integral can also be
written in the parametric form as given in Equation (3.14).

Let $X$ be a normal p.r.v. with mean $\mu_X$ and variance $\sigma_X^2$. We "standardize" $X$ by defining a new random
variable $Y$ given by

$$Y = \frac{X - \mu_X}{\sigma_X}.$$

Since $Y$ is a linear transformation of $X$, it is normal. Furthermore,

$$E[Y] = \frac{E[X] - \mu_X}{\sigma_X} = 0, \qquad \mathrm{Var}[Y] = \frac{\mathrm{Var}[X]}{\sigma_X^2} = 1.$$

Thus, $Y$ is a standard normal p.r.v. This fact allows us to calculate the probability of any event defined in
terms of $X$: we redefine the event in terms of $Y$, and then use the standard normal p.r.v.

The (classical) normal random variable plays an important role in a broad range of probabilistic models.
The main reason is that, generally speaking, it models well the additive effect of many independent factors,
in a variety of engineering, physical, and statistical contexts. As we have shown, the normal p.r.v. preserves
this property and generalizes the classical one in a natural way.

Mathematically, the key fact is that the sum of a large number of independent and identically distributed
(not necessarily normal) phantom random variables has an approximately normal c.p.d.f., regardless of the
c.p.d.f. of the individual random variables. This property is captured in the celebrated central limit theorem,
extended to the phantom framework, which will be discussed in Section 6.

4. Multiple random variable 

Consider a random experiment having the sample space $\Omega$. A multiple phantom random variable,
written m.p.r.v. for short, is a multiple-phantom-valued function

$$(X_1, \dots, X_n) : \Omega \to \mathbb{PH}^{(n)},$$

given by

$$(X_1, \dots, X_n) : \omega \mapsto (X_1(\omega), \dots, X_n(\omega)), \tag{4.1}$$

with each $X_i$ a p.r.v. on $\Omega$ as in Equation (3.1). The phantom range of the m.p.r.v. $(X_1, \dots, X_n)$ is denoted
by $R_{X_1, \dots, X_n}$, and defined by

$$R_{X_1, \dots, X_n} = \{(x_1, \dots, x_n) : \omega \in \Omega,\ x_1 = X_1(\omega), \dots, x_n = X_n(\omega)\}.$$

If the $X_i$'s are each, by themselves, discrete p.r.v.'s, then $(X_1, \dots, X_n)$ is called a discrete m.p.r.v.
Similarly, if the $X_i$'s are each, by themselves, continuous p.r.v.'s, then $(X_1, \dots, X_n)$ is called a continuous
m.p.r.v. When $n = 2$ we write $(X, Y)$ for $(X_1, X_2)$ and call it a bivariate phantom random variable,
written b.p.r.v. for short. In the remainder of this section, to make the exposition clearer, we present the
case of a b.p.r.v.; the extension to m.p.r.v.'s is straightforward.

Consider two discrete p.r. v.'s X and Y associated with the same experiment. The joint phantom mass 
function of X and Y is defined by 

$p_{X,Y}(x, y) = \mathcal{P}(X = x, Y = y)$

for all pairs of phantom numerical values $(x, y)$ that $X$ and $Y$ can take; otherwise it equals zero. (Here
and elsewhere, we will use the abbreviated notation $\mathcal{P}(X = x, Y = y)$ instead of the more precise notation
$\mathcal{P}(\{X = x\} \cap \{Y = y\})$.)

The joint p.m.f. determines the probability of any event that can be specified in terms of the p.r.v.'s $X$
and $Y$. For example, if $A$ is the set of all pairs $(x, y)$ that have a certain property, then

$\mathcal{P}((X, Y) \in A) = \sum_{(x,y) \in A} p_{X,Y}(x, y)$.

In fact, as in classical theory, we can calculate the p.m.f.'s of $X$ and $Y$ by using the formulas

$p_X(x) = \sum_{y} p_{X,Y}(x, y), \qquad p_Y(y) = \sum_{x} p_{X,Y}(x, y)$,

where $x$ and $y$ range respectively over all the phantom values of $X$ and $Y$.
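
As an illustration only, the marginalization formulas above can be sketched in Python. The encoding of a phantom number $a + \wp b$ as a pair `(a, b)`, the helper names `ph_add` and `marginals`, and the numerical values of the joint p.m.f. are assumptions made for this sketch; they are not taken from the theory itself.

```python
from collections import defaultdict

# A phantom number a + ℘b is stored as the pair (a, b); addition is componentwise.
def ph_add(z, w):
    return (z[0] + w[0], z[1] + w[1])

# Hypothetical joint phantom mass function p_{X,Y}. Keys are pairs (x, y)
# (real labels, for simplicity); values are phantom probabilities (re, ph).
joint = {
    (0, 0): (0.10, 0.05),
    (0, 1): (0.30, -0.10),
    (1, 0): (0.20, 0.10),
    (1, 1): (0.40, -0.05),
}

def marginals(joint):
    """Return (p_X, p_Y) by summing the joint p.m.f. over the other variable."""
    p_x = defaultdict(lambda: (0.0, 0.0))
    p_y = defaultdict(lambda: (0.0, 0.0))
    for (x, y), prob in joint.items():
        p_x[x] = ph_add(p_x[x], prob)
        p_y[y] = ph_add(p_y[y], prob)
    return dict(p_x), dict(p_y)

p_x, p_y = marginals(joint)
```

In this example the real terms of each marginal sum to 1 and the phantom terms sum to 0, as one would expect of a phantom probability assigned to the whole sample space.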

Definition 4.1. Two discrete p.r.v.'s $X$ and $Y$ are said to be independent if

$p_{X,Y}(x, y) = p_X(x)\, p_Y(y)$, for all $x, y$.

We say that two continuous p.r.v.'s associated with a common experiment are jointly continuous, and
can be described in terms of a joint p.d.f. $f_{X,Y}$, if $f_{X,Y}$ is a continuous function whose real component is
nonnegative and that satisfies

(4.2) $\mathcal{P}((X, Y) \in B) = \iint_{B} f_{X,Y}(x, y)\, dx\, dy$

for every subset $B$ of

$\gamma_X \times \gamma_Y = \{ (x, y) : x \in \gamma_X,\ y \in \gamma_Y \}$,

where $\gamma_X$ and $\gamma_Y$ are as in Equation (3.7).

Accordingly, given $z_1, z_2, w_1, w_2 \in \mathbb{PH}$, we define

$S_X = \{ x \in \gamma_X : z_1 \preceq_{wk} x \preceq_{wk} z_2 \}, \qquad S_Y = \{ y \in \gamma_Y : w_1 \preceq_{wk} y \preceq_{wk} w_2 \}$,

and have

$\mathcal{P}(z_1 \preceq_{wk} X \preceq_{wk} z_2,\ w_1 \preceq_{wk} Y \preceq_{wk} w_2) = \int_{S_Y} \int_{S_X} f_{X,Y}(x, y)\, dx\, dy$.

(Note that when $S_X$ or $S_Y$ is not connected, the integral decomposes into a sum of integrals, assumed finitely
many.)

Furthermore, by letting $B$ in Equation (4.2) be the entire set $\gamma_X \times \gamma_Y$, we obtain the normalization property

$\int_{\gamma_Y} \int_{\gamma_X} f_{X,Y}(x, y)\, dx\, dy = 1$.

As before, when $S_X$ and $S_Y$ are subpaths of $\gamma_X$ and $\gamma_Y$, respectively, we can use the orders $<_t$ and $<_s$ induced
on $\gamma_X$ and $\gamma_Y$ by their parametrizations, respectively, together with the integral form (1.23), and write:

$\mathcal{P}(x_1 <_t X <_t x_2,\ y_1 <_s Y <_s y_2) = \int_{s(y_1)}^{s(y_2)} \int_{t(x_1)}^{t(x_2)} f_{X,Y}(\gamma_X(t), \gamma_Y(s))\, \gamma_X'(t)\, \gamma_Y'(s)\, dt\, ds$

for $x_1, x_2 \in \gamma_X$, $y_1, y_2 \in \gamma_Y$.

The marginal p.d.f.'s $f_X$ and $f_Y$ of $X$ and $Y$, respectively, are given by:

$f_X(x) = \int_{\gamma_Y} f_{X,Y}(x, y)\, dy$, and $f_Y(y) = \int_{\gamma_X} f_{X,Y}(x, y)\, dx$.

In full analogy with the discrete case, we say that two continuous p.r.v.'s $X$ and $Y$ are independent if
their joint p.d.f. is the product of their marginal p.d.f.'s:

$f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$, for all $x, y$.

By simple computation one can verify that:

Properties 4.2. If $X$ and $Y$ are independent p.r.v.'s then:

(1) The p.r.v.'s $g(X)$ and $h(Y)$ are independent, for any functions $g$ and $h$,

(2) $E[z_1 X + z_2 Y + z_3] = z_1 E[X] + z_2 E[Y] + z_3$, for $z_i \in \mathbb{PH}$,

(3) $E[XY] = E[X]\, E[Y]$, and more generally $E[g(X) h(Y)] = E[g(X)]\, E[h(Y)]$,

(4) $\mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y]$.

These properties can be verified easily by direct computation, or by using the realization property admitted
by $E[\,\cdot\,]$ and $\mathrm{Var}[\,\cdot\,]$ and validating these properties for the standard (real) cases.
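
Properties (3) and (4) can be checked numerically on a small example. The following Python sketch is only an illustration under stated assumptions: the class `Ph`, its product rule (which assumes $\wp^2 = \wp$, as the realization formulas of this paper suggest), the helpers `expect` and `var`, and the two p.m.f.'s are hypothetical choices made here.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Ph:
    """Phantom number re + ℘·ph; the product rule assumes ℘·℘ = ℘."""
    re: float
    ph: float = 0.0

    def __add__(self, other):
        return Ph(self.re + other.re, self.ph + other.ph)

    def __sub__(self, other):
        return Ph(self.re - other.re, self.ph - other.ph)

    def __mul__(self, other):
        return Ph(self.re * other.re,
                  self.re * other.ph + self.ph * other.re + self.ph * other.ph)

def expect(pmf, f=lambda v: v):
    """E[f(X)] = sum_x f(x) p_X(x) for a discrete p.r.v. given as {value: probability}."""
    total = Ph(0.0)
    for value, prob in pmf.items():
        total = total + f(value) * prob
    return total

def var(pmf):
    """Var[X] = E[X^2] - E[X]^2, evaluated in phantom arithmetic."""
    mean = expect(pmf)
    return expect(pmf, lambda v: v * v) - mean * mean

# Hypothetical p.m.f.'s with phantom values and phantom probabilities.
p_x = {Ph(1.0, 0.5): Ph(0.3, 0.1), Ph(2.0, -1.0): Ph(0.7, -0.1)}
p_y = {Ph(0.0, 1.0): Ph(0.5, 0.2), Ph(3.0, 0.0): Ph(0.5, -0.2)}

# Independence: the joint p.m.f. is the product of the marginal p.m.f.'s.
joint = {(x, y): px * py
         for (x, px), (y, py) in product(p_x.items(), p_y.items())}

# E[XY] computed from the joint p.m.f.
e_xy = Ph(0.0)
for (x, y), p in joint.items():
    e_xy = e_xy + x * y * p

# p.m.f. of X + Y, accumulating the probabilities of coinciding sums.
p_sum = {}
for (x, y), p in joint.items():
    p_sum[x + y] = p_sum.get(x + y, Ph(0.0)) + p
```

Up to floating-point error, `e_xy` agrees with `expect(p_x) * expect(p_y)`, and `var(p_sum)` agrees with `var(p_x) + var(p_y)`.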

4.1. Covariance and correlation. The covariance, denoted by Cov[X, Y], of two p.r. v.'s X and Y is 
defined as 

(4.3) $\mathrm{Cov}[X, Y] = E[\, (X - E[X])(Y - E[Y]) \,]$.

The p.r.v.'s $X$ and $Y$ are said to be uncorrelated if $\mathrm{Cov}[X, Y] = 0$. When $\mathrm{Cov}_{re}[X, Y] = 0$ we say that $X$
and $Y$ are real uncorrelated, and if $\mathrm{Cov}_{ph}[X, Y] = 0$ we say that $X$ and $Y$ are phantomly uncorrelated.
We let

$\mathrm{Cov}[\hat X, \hat Y] = E[\, (\hat X - E[\hat X])(\hat Y - E[\hat Y]) \,]$,

and call it the (real) reduced covariance of $X$ and $Y$, where expectations are computed with respect to the
reductions of $X$ and $Y$ and of the probability measure.

Let $U = X - E[X]$ and $V = Y - E[Y]$; then

$\mathrm{Cov}[X, Y] = E[\, U_{re} V_{re} + \wp(\hat U \hat V - U_{re} V_{re}) \,]$,

cf. Equation (pTT5|), which by Equation (pill?) is

$\mathrm{Cov}[X, Y] = E[U_{re} V_{re}] + \wp( E[\hat U \hat V] - E[U_{re} V_{re}] )$.

In other words,

(4.4) $\mathrm{Cov}[X, Y] = \mathrm{Cov}[X_{re}, Y_{re}] + \wp( \mathrm{Cov}[\hat X, \hat Y] - \mathrm{Cov}[X_{re}, Y_{re}] )$,

that is, the phantom covariance admits the realization property.
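
The realization property (4.4) can be checked numerically. The sketch below is an illustration only: the class `Ph` (whose product rule assumes $\wp^2 = \wp$), the helper names, and the joint p.m.f. are assumptions made here. It computes the covariance (4.3) directly in phantom arithmetic and compares it with the two classical covariances, one taken on the real terms and one on the reductions (real term plus phantom term).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ph:
    """Phantom number re + ℘·ph; the product rule assumes ℘·℘ = ℘."""
    re: float
    ph: float = 0.0

    def __add__(self, other):
        return Ph(self.re + other.re, self.ph + other.ph)

    def __sub__(self, other):
        return Ph(self.re - other.re, self.ph - other.ph)

    def __mul__(self, other):
        return Ph(self.re * other.re,
                  self.re * other.ph + self.ph * other.re + self.ph * other.ph)

    @property
    def hat(self):
        """Real reduction re + ph."""
        return self.re + self.ph

def phantom_cov(joint):
    """Cov[X, Y] = E[(X - E[X])(Y - E[Y])], evaluated in phantom arithmetic."""
    zero = Ph(0.0)
    ex = sum((x * p for x, _, p in joint), zero)
    ey = sum((y * p for _, y, p in joint), zero)
    return sum(((x - ex) * (y - ey) * p for x, y, p in joint), zero)

def real_cov(points):
    """Classical covariance of a real discrete distribution of (x, y, p) triples."""
    ex = sum(x * p for x, _, p in points)
    ey = sum(y * p for _, y, p in points)
    return sum((x - ex) * (y - ey) * p for x, y, p in points)

# Hypothetical joint p.m.f.: triples (x, y, probability), all phantom numbers.
joint = [
    (Ph(1.0, 0.5), Ph(2.0, -0.5), Ph(0.2, 0.1)),
    (Ph(1.0, 0.5), Ph(0.0, 1.0), Ph(0.3, -0.2)),
    (Ph(3.0, -1.0), Ph(2.0, -0.5), Ph(0.1, 0.2)),
    (Ph(3.0, -1.0), Ph(0.0, 1.0), Ph(0.4, -0.1)),
]

cov = phantom_cov(joint)
cov_re = real_cov([(x.re, y.re, p.re) for x, y, p in joint])
cov_hat = real_cov([(x.hat, y.hat, p.hat) for x, y, p in joint])
realized = Ph(cov_re, cov_hat - cov_re)  # right-hand side of (4.4)
```

Up to rounding, `cov` and `realized` coincide, which is the content of (4.4) for this example.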

Roughly speaking, positive or negative parts (cf. Definition 1.9) of the covariance indicate that the values of
$X - E[X]$ and $Y - E[Y]$ obtained in a single experiment tend to have the same or the opposite sign, respectively.
Thus, the signs of the real and the phantom terms of the covariance provide an important qualitative indicator
of the relation between the real components and the phantom components of $X$ and $Y$. If $X$ and $Y$ are
independent, then

$\mathrm{Cov}[X, Y] = E[\, (X - E[X])(Y - E[Y]) \,] = E[X - E[X]]\; E[Y - E[Y]] = 0$.

Therefore, if $X$ and $Y$ are independent, they are also uncorrelated. However, as in classical theory, the converse
is not true.

The correlation coefficient $\rho(X, Y)$ of two p.r.v.'s $X$ and $Y$, whose variances are not zero divisors, is
defined as

(4.5) $\rho(X, Y) = \frac{\mathrm{Cov}[X, Y]}{\sqrt{\mathrm{Var}[X]\, \mathrm{Var}[Y]}}$.

This may be viewed as a normalized version of the phantom covariance $\mathrm{Cov}[X, Y]$, and as the computation
below shows, the real term of $\rho(X, Y)$ ranges from $-1$ to $1$.

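Using Equation (3.23) and Equation (4.4), together with the realization property of the square root (1.16),
Equation (4.5) takes the familiar form:

$\rho(X, Y) = \frac{\mathrm{Cov}[X_{re}, Y_{re}] + \wp( \mathrm{Cov}[\hat X, \hat Y] - \mathrm{Cov}[X_{re}, Y_{re}] )}{\sqrt{\mathrm{Var}[X_{re}] + \wp( \mathrm{Var}[\hat X] - \mathrm{Var}[X_{re}] )}\; \sqrt{\mathrm{Var}[Y_{re}] + \wp( \mathrm{Var}[\hat Y] - \mathrm{Var}[Y_{re}] )}}$

$= \frac{\mathrm{Cov}[X_{re}, Y_{re}] + \wp( \mathrm{Cov}[\hat X, \hat Y] - \mathrm{Cov}[X_{re}, Y_{re}] )}{\left( \sqrt{\mathrm{Var}[X_{re}]} + \wp( \sqrt{\mathrm{Var}[\hat X]} - \sqrt{\mathrm{Var}[X_{re}]} ) \right) \left( \sqrt{\mathrm{Var}[Y_{re}]} + \wp( \sqrt{\mathrm{Var}[\hat Y]} - \sqrt{\mathrm{Var}[Y_{re}]} ) \right)}$

$= \frac{\mathrm{Cov}[X_{re}, Y_{re}] + \wp( \mathrm{Cov}[\hat X, \hat Y] - \mathrm{Cov}[X_{re}, Y_{re}] )}{\sqrt{\mathrm{Var}[X_{re}]\, \mathrm{Var}[Y_{re}]} + \wp\left( \sqrt{\mathrm{Var}[\hat X]\, \mathrm{Var}[\hat Y]} - \sqrt{\mathrm{Var}[X_{re}]\, \mathrm{Var}[Y_{re}]} \right)}$

$= \frac{\mathrm{Cov}[X_{re}, Y_{re}]}{\sqrt{\mathrm{Var}[X_{re}]\, \mathrm{Var}[Y_{re}]}} + \wp\, \frac{\mathrm{Cov}[\hat X, \hat Y] \sqrt{\mathrm{Var}[X_{re}]\, \mathrm{Var}[Y_{re}]} - \mathrm{Cov}[X_{re}, Y_{re}] \sqrt{\mathrm{Var}[\hat X]\, \mathrm{Var}[\hat Y]}}{\sqrt{\mathrm{Var}[X_{re}]\, \mathrm{Var}[Y_{re}]}\; \sqrt{\mathrm{Var}[\hat X]\, \mathrm{Var}[\hat Y]}}$.

Therefore,

(4.6) $\rho(X, Y) = \rho(X_{re}, Y_{re}) + \wp( \rho(\hat X, \hat Y) - \rho(X_{re}, Y_{re}) )$,

which is the realization property for the phantom correlation coefficient.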
Let $A$ be the set

(4.7) $A = \{ z = a + \wp b \in \mathbb{PH} \mid a \in [-1, 1],\ -(1 + a) \le b \le 1 - a \}$,

i.e. it is the pointwise product 2 A - 1. We write $A_{(+,+)}$ for the subset of $A$ consisting of all phantom points
whose real and phantom terms are positive; $A_{(+,-)}$, $A_{(-,+)}$, and $A_{(-,-)}$ are defined respectively according to
the signs of the real term and the phantom term of their points.

Proposition 4.3. Given any two p.r.v.'s $X$ and $Y$, then $\rho(X, Y) \in A$ as defined in Equation (4.7).

Proof. Using classical theory, both $\rho(\hat X, \hat Y)$ and $\rho(X_{re}, Y_{re})$ range from $-1$ to $1$; the proof is completed by
Equation (4.6). □
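
Proposition 4.3 can be illustrated numerically. The Python sketch below is hedged accordingly: the joint p.m.f., the row layout, and the helper names `real_corr` and `in_region` are assumptions made here. It evaluates the two classical correlation coefficients appearing in (4.6), on the real terms and on the reductions, assembles the phantom correlation coefficient as a pair (real term, phantom term), and tests membership in the set of Equation (4.7).

```python
import math

def real_corr(points):
    """Classical correlation coefficient of a real discrete distribution of (x, y, p) triples."""
    ex = sum(x * p for x, _, p in points)
    ey = sum(y * p for _, y, p in points)
    cov = sum((x - ex) * (y - ey) * p for x, y, p in points)
    var_x = sum((x - ex) ** 2 * p for x, _, p in points)
    var_y = sum((y - ey) ** 2 * p for _, y, p in points)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical joint p.m.f.; each row is (x_re, x_ph, y_re, y_ph, p_re, p_ph).
joint = [
    (1.0, 0.5, 2.0, -0.5, 0.2, 0.1),
    (1.0, 0.5, 0.0, 1.0, 0.3, -0.2),
    (3.0, -1.0, 2.0, -0.5, 0.1, 0.2),
    (3.0, -1.0, 0.0, 1.0, 0.4, -0.1),
]

# Real terms, and reductions (real term + phantom term), of values and probabilities.
real_part = [(xr, yr, pr) for xr, _, yr, _, pr, _ in joint]
reduction = [(xr + xp, yr + yp, pr + pp) for xr, xp, yr, yp, pr, pp in joint]

rho_re = real_corr(real_part)
rho_hat = real_corr(reduction)
rho = (rho_re, rho_hat - rho_re)  # Equation (4.6) as (real term, phantom term)

def in_region(z, tol=1e-12):
    """Membership test for the set of Equation (4.7)."""
    a, b = z
    return -1 - tol <= a <= 1 + tol and -(1 + a) - tol <= b <= (1 - a) + tol
```

Since both classical coefficients lie in $[-1, 1]$, the call `in_region(rho)` returns `True` for any admissible input, in line with the proof above.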



If $\rho \in A_{(+,+)}$ (or $\rho \in A_{(-,-)}$), then the real and the phantom values of $x - E[X]$ and $y - E[Y]$ "tend" to
have the same (or opposite, respectively) sign, and the size of $|\rho|$ provides a normalized measure of the extent
to which this is true. In fact, always assuming that $X$ and $Y$ have positive variances, it can be shown that
$\rho = 1$ (or $\rho = -1$) if and only if there exists a positive (or negative, respectively) constant phantom number $a$
such that

$y - E[Y] = a\,(x - E[X])$, for all possible numerical values $(x, y)$.

When $\rho \in A_{(+,-)}$, or $\rho \in A_{(-,+)}$, then the real terms of $x - E[X]$ and $y - E[Y]$ "tend" to have the same (or
opposite, respectively) sign, opposite to that of their phantom terms.

5. Moment generating functions 

The moment generating function, written m.g.f. for short, of the distribution function of a p.r.v. $X$
(also referred to as the transform of $X$) is a phantom function $M_X(\zeta)$ of a free phantom parameter $\zeta \in \mathbb{PH}$,
defined by

$M_X(\zeta) = E[e^{\zeta X}]$.

In more detail, the corresponding transform of $X$ is given by:

(5.1) $M_X(\zeta) = \begin{cases} \sum_{x} e^{\zeta x}\, p_X(x), & X \text{ discrete}; \\ \int_{\gamma} e^{\zeta x} f_X(x)\, dx, & X \text{ continuous}. \end{cases}$

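Let $\zeta = \zeta_{re} + \wp\, \zeta_{ph}$ and use Equation (1.17) to write

$E[e^{\zeta X}] = E[\, e^{\zeta_{re} X_{re} + \wp( \hat\zeta \hat X - \zeta_{re} X_{re} )} \,] = E[\, e^{\zeta_{re} X_{re}} + \wp( e^{\hat\zeta \hat X} - e^{\zeta_{re} X_{re}} ) \,]$.

Then, by Equation (3.19), one has the realization property for the m.g.f.

$E[e^{\zeta X}] = E[e^{\zeta_{re} X_{re}}] + \wp\left( E[e^{\hat\zeta \hat X}] - E[e^{\zeta_{re} X_{re}}] \right)$,

and thus

(5.2) $M_X(\zeta) = M_{X_{re}}(\zeta_{re}) + \wp\left( M_{\hat X}(\hat\zeta) - M_{X_{re}}(\zeta_{re}) \right)$,

where $M_{X_{re}}(\zeta_{re})$ and $M_{\hat X}(\hat\zeta)$ are standard (real) moment generating functions.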

Theorem 5.1 (Inversion property). The m.g.f. $M_X(\zeta)$ completely determines the probability law of the
random variable $X$. In particular, if $M_X(\zeta) = M_Y(\zeta)$ for all $\zeta$, then the random variables $X$ and $Y$ have the
same probability law.

This property is a rather profound mathematical fact that is used frequently in classical probability theory.
In light of Equation (5.2), i.e. the realization property of $M_X(\zeta)$, this phantom property is derived directly
from the known result for the standard (real) m.g.f., applied to each component, in classical probability
theory [3].

Transform methods are particularly convenient when dealing with a sum of p.r.v.'s, since they convert addition
of independent p.r.v.'s to multiplication of transforms, as we now show. Let $X$ and $Y$ be independent p.r.v.'s,
and let $W = X + Y$. The transform associated with $W$ is, by definition,

$M_W(\zeta) = E[e^{\zeta W}] = E[e^{\zeta (X + Y)}] = E[e^{\zeta X} e^{\zeta Y}]$;

the last equality is due to Equation (1.17).

Consider a fixed value of the parameter $\zeta \in \mathbb{PH}$. Since $X$ and $Y$ are independent, $e^{\zeta X}$ and $e^{\zeta Y}$ are also
independent p.r.v.'s. Hence, the expectation of their product is the product of the expectations, and thus

$M_W(\zeta) = E[e^{\zeta X}]\, E[e^{\zeta Y}] = M_X(\zeta)\, M_Y(\zeta)$.

By the same argument, if $X_1, \dots, X_n$ is a collection of independent p.r.v.'s, and $W = X_1 + \cdots + X_n$, then

$M_W(\zeta) = M_{X_1}(\zeta) \cdots M_{X_n}(\zeta)$.

5.1. Examples of moment generating functions.

Example 5.2 (The transform of a linear function of a random variable). Let $M_X(\zeta)$ be the transform
associated with a p.r.v. $X$. Consider a new p.r.v. $Y = uX + v$ for some $u, v \in \mathbb{PH}$. We then have

$M_Y(\zeta) = E[e^{\zeta (uX + v)}] = e^{\zeta v} E[e^{u \zeta X}] = e^{\zeta v} M_X(u \zeta)$.

Example 5.3 (The transform of the binomial). Let $X_1, \dots, X_n$ be independent Bernoulli p.r.v.'s, cf.
Example 3.18, with a common parameter $p$, assigned to probability. Then,

$M_{X_i}(\zeta) = (1 - p)\, e^{0 \zeta} + p\, e^{1 \zeta} = 1 - p + p\, e^{\zeta}$, for all $i$.

The p.r.v. $Y = X_1 + \cdots + X_n$ is phantom binomial with parameters $n \in \mathbb{N}$ and $p \in \mathbb{PH}$. Its transform is given
by

$M_Y(\zeta) = (1 - p + p\, e^{\zeta})^n$.

Example 5.4 (The sum of independent Poisson random variables is Poisson). Let $X$ and $Y$ be
independent Poisson p.r.v.'s with means $\mu_X$ and $\mu_Y$, respectively, and let $W = X + Y$. Then,

$M_X(\zeta) = e^{\mu_X (e^{\zeta} - 1)}, \qquad M_Y(\zeta) = e^{\mu_Y (e^{\zeta} - 1)}$,

and

$M_W(\zeta) = M_X(\zeta)\, M_Y(\zeta) = e^{\mu_X (e^{\zeta} - 1)}\, e^{\mu_Y (e^{\zeta} - 1)} = e^{(\mu_X + \mu_Y)(e^{\zeta} - 1)}$.

Thus, $W$ has the same transform as a Poisson p.r.v. with mean $\mu_X + \mu_Y$. By the uniqueness property of
transforms, $W$ is Poisson with mean $\mu_X + \mu_Y$.
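
The multiplicativity used in Example 5.4 can be checked numerically. The sketch below is an illustration under assumptions stated here: the class `Ph` uses the product rule that follows from $\wp^2 = \wp$, the method `exp` implements the realization form of the exponential used in this section, and the values of the means and of $\zeta$ are arbitrary.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Ph:
    """Phantom number re + ℘·ph; the product rule assumes ℘·℘ = ℘."""
    re: float
    ph: float = 0.0

    def __add__(self, other):
        return Ph(self.re + other.re, self.ph + other.ph)

    def __sub__(self, other):
        return Ph(self.re - other.re, self.ph - other.ph)

    def __mul__(self, other):
        return Ph(self.re * other.re,
                  self.re * other.ph + self.ph * other.re + self.ph * other.ph)

    def exp(self):
        """Realization form: e^(a + ℘b) = e^a + ℘(e^(a+b) - e^a)."""
        return Ph(math.exp(self.re), math.exp(self.re + self.ph) - math.exp(self.re))

def poisson_mgf(mu, zeta):
    """M(ζ) = exp(μ (e^ζ - 1)) for a phantom mean μ and a phantom parameter ζ."""
    return (mu * (zeta.exp() - Ph(1.0))).exp()

# Hypothetical phantom means and parameter value.
mu_x, mu_y = Ph(2.0, 0.5), Ph(1.0, -0.3)
zeta = Ph(0.4, 0.2)

lhs = poisson_mgf(mu_x, zeta) * poisson_mgf(mu_y, zeta)  # M_X(ζ) M_Y(ζ)
rhs = poisson_mgf(mu_x + mu_y, zeta)                     # transform with mean μ_X + μ_Y
```

Both terms of `lhs` and `rhs` agree up to rounding, reflecting that the phantom exponential turns sums into products.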

6. Limit theorems 

6.1. Some useful inequalities. Before getting to probability inequalities, we need further results about the
weak order $\preceq_{wk}$ on $\mathbb{PH}$, including its relations with the phantom absolute value as defined in Equation (1.18).
We recall that $\preceq_{wk}$ is assumed to satisfy Properties 1.13.

Remark 6.1. The classical relation $z^2 = |z|^2$ does not always hold phantomly; we might have $z^2 \prec_{wk} |z|^2$ or
$z^2 \succ_{wk} |z|^2$. Note that $|z|^2$ is real while $z^2$ is phantom. It is easy to verify that, for $z = a + \wp b$, the
relation $z^2 = |z|^2$ holds iff $b = 0$ or $b = -2a$.

Moreover, from a metric point of view, there are numbers that are "close" in the sense of the weak order
$\preceq_{wk}$, but very far in the sense of $|\ |$; for example, assuming $\preceq_{wk}$ is the lexicographic order, a small increase
of $\epsilon$ makes $z_1 = \epsilon + \wp\, \epsilon$ greater than $z_2 = \epsilon + \wp\, b$, but still $|z_2| > |z_1|$.
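
The comparison of $z^2$ with $|z|^2$ in Remark 6.1 can be verified mechanically. The sketch below is only an illustration: it assumes the product rule $\wp^2 = \wp$ and the form $|z|^2 = a^2 + ab + b^2/2$ of the phantom absolute value, which is the form used in the proof of Proposition 6.2 (iii); the function names are chosen here.

```python
def ph_square(a, b):
    """(a + ℘b)^2 = a^2 + ℘(2ab + b^2), assuming ℘·℘ = ℘; returns (real term, phantom term)."""
    return (a * a, 2 * a * b + b * b)

def abs_squared(a, b):
    """|a + ℘b|^2 = a^2 + ab + b^2/2, a real number."""
    return a * a + a * b + b * b / 2

def square_equals_abs_squared(a, b, tol=1e-12):
    """True when z^2 is real and equal to |z|^2."""
    re, ph = ph_square(a, b)
    return abs(ph) <= tol and abs(re - abs_squared(a, b)) <= tol
```

For instance, `square_equals_abs_squared(1.5, -3.0)` holds, since $b = -2a$ there, and the real case $b = 0$ holds trivially, while `square_equals_abs_squared(1.0, 1.0)` fails because $z^2 = 1 + 3\wp$ is not real.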



The mismatch between $\preceq_{wk}$ and $|\ |$, as addressed in Remark 6.1, yields different versions of the phantom
Markov inequality, aiming to provide later a phantom version of the Chebyshev inequality.

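Proposition 6.2 (Markov phantom inequalities). Let $X$ be a p.r.v. that takes only values $\succeq_{wk} 0$. Then

(i) $\mathcal{P}(X \succeq_{wk} z) \preceq_{wk} \frac{E[X]}{z}$, for any pseudo positive $z \in \mathbb{PH}$,

(ii) $\mathcal{P}(|X| \ge |z|) \preceq_{wk} \frac{E[|X|]}{|z|}$, for any $z \in \mathbb{PH}$,

(iii) $|\mathcal{P}(|X| \ge |z|)| \le \frac{|E[|X|]|}{|z|}$, for any $z \in \mathbb{PH}$.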

Proof. (i) Fix a pseudo positive $z \in \mathbb{PH}$ and consider the random variable $Y_z$ defined by

$Y_z = \begin{cases} 0, & X \prec_{wk} z; \\ z, & X \succeq_{wk} z. \end{cases}$

It is seen that the relation $Y_z \preceq_{wk} X$ always holds and therefore, using Properties 1.13 (ii) for sums
and products, $E[Y_z] \preceq_{wk} E[X]$. On the other hand, $E[Y_z] = z\, \mathcal{P}(Y_z = z) = z\, \mathcal{P}(X \succeq_{wk} z)$, from
which we obtain $z\, \mathcal{P}(X \succeq_{wk} z) \preceq_{wk} E[X]$. The proof is then completed by Properties 1.13 (ii) for division.

(ii) Apply part (i) to $|X|$ and $|z|$, since both are positive.

(iii) We need to prove that $|E[|Y_z|]| \le |E[|X|]|$, or equivalently that $0 \le |E[|X|]|^2 - |E[|Y_z|]|^2$; then the
required inequality is obtained by part (ii). We prove the assertion for a discrete p.r.v.; the continuous
version is obtained similarly.

Let $y_x \in \{0, |z|\}$ denote the value of $Y_z$ corresponding to $x \in X$; accordingly $p_X(x) = p_Y(y_x)$ for each $x \in X$.
We write $x$ and $y_x$ for $|x|$ and $|y_x|$, respectively, assuming both are real nonnegative. Then, denoting by
$p_{X,re}$ and $p_{X,ph}$ the real and the phantom components of $p_X$, respectively,

$|E[|Y_z|]|^2 = E_{re}[|Y_z|]^2 + E_{re}[|Y_z|]\, E_{ph}[|Y_z|] + E_{ph}[|Y_z|]^2 / 2$

$= \sum_{x',x''} y_{x'} y_{x''}\, p_{X,re}(x')\, p_{X,re}(x'') + \sum_{x',x''} y_{x'} y_{x''}\, p_{X,re}(x')\, p_{X,ph}(x'') + \sum_{x',x''} y_{x'} y_{x''}\, p_{X,ph}(x')\, p_{X,ph}(x'') / 2$

$= \sum_{x',x''} y_{x'} y_{x''} \left( p_{X,re}(x')\, p_{X,re}(x'') + p_{X,re}(x')\, p_{X,ph}(x'') + p_{X,ph}(x')\, p_{X,ph}(x'') / 2 \right)$,

and $|E[|X|]|^2$ is expressed in the same way.

Letting $g_X(x', x'') = p_{X,re}(x')\, p_{X,re}(x'') + p_{X,re}(x')\, p_{X,ph}(x'') + p_{X,ph}(x')\, p_{X,ph}(x'') / 2$, as it is derived
from the absolute value, one observes that $g_X(x', x'') \ge 0$.

Putting all together, and considering the difference, we have

$|E[|X|]|^2 - |E[|Y_z|]|^2 = \sum_{x',x''} (x' x'' - y_{x'} y_{x''})\, g_X(x', x'')$,

in which all components are $\ge 0$. Since $x' \ge y_{x'}$ and $x'' \ge y_{x''}$, then $x' x'' - y_{x'} y_{x''} \ge 0$, and thus the
sum is $\ge 0$ as desired.

□
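
The absolute-value form (iii) of the Markov inequality can be checked numerically on a small discrete example. The sketch below is an illustration under assumptions made here: the phantom absolute value is taken in the form $|a + \wp b|^2 = a^2 + ab + b^2/2$ used in the proof above, the p.r.v. takes real nonnegative values, and its phantom p.m.f. is chosen so that both the real terms and the reductions (real term plus phantom term) form classical p.m.f.'s.

```python
import math

def ph_abs(a, b):
    """Phantom absolute value, with |a + ℘b|^2 = a^2 + ab + b^2/2."""
    return math.sqrt(a * a + a * b + b * b / 2)

# Hypothetical p.m.f.: real nonnegative values |x| mapped to phantom probabilities (p_re, p_ph).
pmf = {0.5: (0.4, -0.1), 1.0: (0.3, 0.2), 4.0: (0.2, -0.05), 9.0: (0.1, -0.05)}

def tail_probability(pmf, t):
    """Phantom probability P(|X| >= t), returned as (real term, phantom term)."""
    re = sum(p for x, (p, _) in pmf.items() if x >= t)
    ph = sum(q for x, (_, q) in pmf.items() if x >= t)
    return (re, ph)

def expectation(pmf):
    """Phantom expectation E[|X|], returned as (real term, phantom term)."""
    re = sum(x * p for x, (p, _) in pmf.items())
    ph = sum(x * q for x, (_, q) in pmf.items())
    return (re, ph)

def markov_holds(pmf, t):
    """Check |P(|X| >= t)| <= |E[|X|]| / t for a real threshold t > 0."""
    lhs = ph_abs(*tail_probability(pmf, t))
    rhs = ph_abs(*expectation(pmf)) / t
    return lhs <= rhs + 1e-12
```

For this p.m.f., `markov_holds(pmf, t)` returns `True` for every threshold tried, e.g. `t` in `(0.5, 1, 2, 4, 9, 20)`.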

We write $\mu_{|X|}$ and $\sigma^2_{|X|}$ for $E[|X|]$ and $\mathrm{Var}[|X|]$, respectively, and then have the phantom analogue of the
Chebyshev inequality.

Proposition 6.3 (Chebyshev phantom inequality). If $X$ is a random variable with mean $\mu_{|X|}$ and variance
$\sigma^2_{|X|}$, then

$|\mathcal{P}(\, ||X| - \mu_{|X|}| \ge |z| \,)| \le \frac{|\sigma^2_{|X|}|}{|z|^2}$

for all $z \ne 0$.



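Proof. Consider the nonnegative random variable $(|X| - \mu_{|X|})^2$ and apply the Markov inequality (iii) with
$|w|^2$ to obtain

$|\mathcal{P}(\, (|X| - \mu_{|X|})^2 \ge |w|^2 \,)| \le \frac{|E[\, |(|X| - \mu_{|X|})^2| \,]|}{|w|^2}$.

Since $(|X| - \mu_{|X|})^2$ is a real nonnegative number, $|(|X| - \mu_{|X|})^2| = (|X| - \mu_{|X|})^2$, and thus

$|\mathcal{P}(\, (|X| - \mu_{|X|})^2 \ge |w|^2 \,)| \le \frac{|E[\, (|X| - \mu_{|X|})^2 \,]|}{|w|^2} = \frac{|\sigma^2_{|X|}|}{|w|^2}$.

The derivation is completed by observing that the event $(|X| - \mu_{|X|})^2 \ge |w|^2$ is identical to the event
$||X| - \mu_{|X|}| \ge |w|$, and

$|\mathcal{P}(\, ||X| - \mu_{|X|}| \ge |w| \,)| = |\mathcal{P}(\, (|X| - \mu_{|X|})^2 \ge |w|^2 \,)| \le \frac{|\sigma^2_{|X|}|}{|w|^2}$.

□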



An alternative form of the Chebyshev inequality is obtained by letting $|w| = c\, \sigma_{|X|}$, where $c$ is a positive real,
which yields

$|\mathcal{P}(\, ||X| - \mu_{|X|}| \ge c\, \sigma_{|X|} \,)| \le \frac{|\sigma^2_{|X|}|}{(c\, \sigma_{|X|})^2} = \frac{1}{c^2}$.

Thus, the probability that a random variable $|X|$ takes a value more than $c$ standard deviations
away from the mean $\mu_{|X|}$ is at most $1/c^2$.

The Chebyshev inequality is generally more powerful than the Markov inequality (the bounds that it 
provides are more accurate), because it also makes use of information on the variance of X. Still, as usual, the 
mean and the variance of a random variable are only a rough summary of the properties of its distribution, 
and we cannot expect the bounds to be close approximations of the exact probabilities.

6.2. The weak law of large numbers. Consider a sequence $X_1, X_2, \dots$ of independent identically
distributed p.r.v.'s, each with mean $\mu$ and variance $\sigma^2$. Let

$S_n = X_1 + \cdots + X_n$

be the sum of the first $n$ of them. As in classical theory, phantom limit theorems are mostly concerned with the
properties of $S_n$ and related p.r.v.'s, as $n$ becomes very large. In fact, the realization property of phantoms
provides the phantom analogues to these theorems in a trivial way.

Because of the independence of the $X_i$'s, we have

$\mathrm{Var}[S_n] = \mathrm{Var}[X_1] + \cdots + \mathrm{Var}[X_n] = n \sigma^2$.

Thus, the distribution of $S_n$ spreads out as $n$ increases, and does not have a meaningful limit. The situation
is different if we consider the sample mean

$M_n = \frac{X_1 + \cdots + X_n}{n} = \frac{S_n}{n}$,

which can also be written as

(6.1) $M_n = M_{n,re} + \wp( \hat M_n - M_{n,re} )$.

A quick calculation, together with the independence, shows that

$E[M_n] = \mu, \qquad \mathrm{Var}[M_n] = \frac{\sigma^2}{n}$.
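
These two identities can be checked exactly, up to rounding, for Bernoulli p.r.v.'s. The sketch below is an illustration under assumptions stated here: the class `Ph` uses the product rule that follows from $\wp^2 = \wp$, the sum $S_n$ is taken to have the phantom binomial p.m.f. $\binom{n}{k} p^k (1 - p)^{n-k}$, and the values of `n` and `p` are arbitrary.

```python
from dataclasses import dataclass
from math import comb

@dataclass(frozen=True)
class Ph:
    """Phantom number re + ℘·ph; the product rule assumes ℘·℘ = ℘."""
    re: float
    ph: float = 0.0

    def __add__(self, other):
        return Ph(self.re + other.re, self.ph + other.ph)

    def __sub__(self, other):
        return Ph(self.re - other.re, self.ph - other.ph)

    def __mul__(self, other):
        return Ph(self.re * other.re,
                  self.re * other.ph + self.ph * other.re + self.ph * other.ph)

    def scale(self, c):
        """Multiply by a real scalar c."""
        return Ph(c * self.re, c * self.ph)

def ph_pow(z, k):
    """z**k for a nonnegative integer k, by repeated phantom multiplication."""
    result = Ph(1.0)
    for _ in range(k):
        result = result * z
    return result

def sample_mean_moments(n, p):
    """Return (E[M_n], Var[M_n]) for the mean of n independent Bernoulli(p) p.r.v.'s."""
    q = Ph(1.0) - p
    mean, second = Ph(0.0), Ph(0.0)
    for k in range(n + 1):
        # Phantom binomial probability of S_n = k.
        prob = (ph_pow(p, k) * ph_pow(q, n - k)).scale(comb(n, k))
        mean = mean + prob.scale(k / n)
        second = second + prob.scale((k / n) ** 2)
    return mean, second - mean * mean

n, p = 8, Ph(0.3, 0.2)  # hypothetical sample size and phantom parameter
mean, variance = sample_mean_moments(n, p)
expected_variance = (p * (Ph(1.0) - p)).scale(1 / n)  # sigma^2 / n with sigma^2 = p(1 - p)
```

Here `mean` reproduces `p` and `variance` reproduces `expected_variance`, in both the real and the phantom term.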



We apply the Chebyshev inequality and obtain

(6.2) $|\mathcal{P}(\, ||M_n| - \mu_{|M_n|}| \ge \epsilon \,)| \le \frac{\sigma^2}{n \epsilon^2}$

for any real $\epsilon > 0$.



We observe that for any fixed real $\epsilon > 0$, the right-hand side of this inequality goes to zero as $n$ increases.
This form gives one way to approach phantom limit theorems. However, in the sequel, we focus on the way
established by the realization property. This means that we consider phantom probability for abstract events,
or random variables.

Next we consider the phantom weak law of large numbers, stated below. It turns out that this law remains
true even if the $X_i$ have infinite variance, but a much more elaborate argument is needed, which we omit. The
only assumption needed is that $E[X_i]$ is well-defined and finite.

Theorem 6.4 (The weak law of large numbers (WLLN)). Let $X_1, X_2, \dots$ be independent identically
distributed p.r.v.'s with mean $\mu$. For every real $\epsilon > 0$, we have

$\mathcal{P}(|M_n - \mu| \ge \epsilon) \longrightarrow 0$, as $n \to \infty$,

or equivalently

$\mathcal{P}(|M_n - \mu| < \epsilon) \longrightarrow 1$, as $n \to \infty$.

Proof. Recall that $|M_n - \mu| \to 0$ iff both $\mathrm{re}(M_n - \mu) \to 0$ and $\mathrm{ph}(M_n - \mu) \to 0$, cf. Lemma 1.30, and that
$\mathcal{P}_{re}$ is a standard (real) probability measure for any phantom probability measure $\mathcal{P} = \mathcal{P}_{re} + \wp\, \mathcal{P}_{ph}$. Then,
since $|M_n - \mu| < \epsilon$ is an inequality of real random variables, by the known WLLN for real probabilities,
$\mathcal{P}_{re}(|M_n - \mu| < \epsilon) \to 1$ as $n \to \infty$, which means $\mathcal{P}_{ph}(|M_n - \mu| < \epsilon) \to 0$, since $\mathcal{P}$ is a phantom probability
measure. □



As in classical theory, the phantom WLLN states that for a large $n$, the "bulk" of the distribution of $M_n$
is concentrated near $\mu$. That is, if we consider a neighborhood around $\mu$, which here is 2-dimensional, then
there is a high probability that $M_n$ will fall in that neighborhood; as $n \to \infty$, this probability converges to 1.
Of course, if $\epsilon$ is very small, we may have to wait longer (i.e., need a larger value of $n$) before we can assert
that $M_n$ is highly likely to fall in that neighborhood.

Corollary 6.5. Let $X_1, X_2, \dots$ be independent identically distributed p.r.v.'s with mean $\mu$. For every real
$\epsilon > 0$, we have $|\mathcal{P}(|M_n - \mu| < \epsilon)| \to 1$, as $n \to \infty$.



6.3. The central limit theorem. We can interpret the WLLN as stating that $M_n$ converges to $\mu$. However,
since $M_1, M_2, \dots$ is a sequence of phantom random variables, not a sequence of phantom numbers, the meaning
of convergence, in the phantom sense, has to be made precise. A particular definition is provided below. To facilitate
the comparison with the ordinary notion of convergence, we also include the definition of the latter.

Definition 6.6. Let $X_1, X_2, \dots$ be a sequence of p.r.v.'s (not necessarily independent), and let $z$ be a phantom
number. We say that the sequence $X_n$ converges to $z$ in probability, if for every real $\epsilon > 0$, we have

$\lim_{n \to \infty} \mathcal{P}(|X_n - z| \ge \epsilon) = 0$,

or equivalently, for every real $\delta > 0$ and for every real $\epsilon > 0$, there exists some $n_0$ such that

$|\mathcal{P}(|X_n - z| \ge \epsilon)| \le \delta$,

for all $n \ge n_0$.

According to the weak law of large numbers, the distribution of the sample mean $M_n$ is increasingly
concentrated in the near vicinity of the true mean $\mu$. In particular, its variance tends to zero. On the other
hand, the variance of the sum $S_n = X_1 + \cdots + X_n = n M_n$ is unbounded, and the distribution of $S_n$ cannot
be said to converge to anything meaningful.

An intermediate view is obtained by considering the deviation $S_n - n\mu$ of $S_n$ from its mean $n\mu$, and scaling
it by a (real) factor proportional to $1/\sqrt{n}$. What is special about this particular scaling is that it keeps the
variance, even though it is phantom, at a constant level. The central limit theorem asserts that the distribution
of this scaled phantom random variable approaches a normal phantom distribution.

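More specifically, let $X_1, X_2, \dots$ be a sequence of independent identically distributed p.r.v.'s with mean $\mu$
and variance $\sigma^2$. We define

(6.3) $W_n = \frac{S_n - n\mu}{\sigma \sqrt{n}} = \frac{X_1 + \cdots + X_n - n\mu}{\sigma \sqrt{n}}$.

An easy calculation yields:

$E[W_n] = \frac{E[X_1 + \cdots + X_n] - n\mu}{\sigma \sqrt{n}} = 0$,

and

$\mathrm{Var}[W_n] = \frac{\mathrm{Var}[X_1 + \cdots + X_n]}{\sigma^2 n} = \frac{\mathrm{Var}[X_1] + \cdots + \mathrm{Var}[X_n]}{\sigma^2 n} = \frac{n \sigma^2}{n \sigma^2} = 1$.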

Theorem 6.7 (The phantom central limit theorem). Let $X_1, X_2, \dots$ be a sequence of independent
identically distributed p.r.v.'s with common mean $\mu$ and a finite variance $\sigma^2$, and let $W_n$ be defined as in Equation
(6.3). Then, the c.p.d.f. of $W_n$ converges to the standard normal c.p.d.f.; that is, for a given phantom value
$z \in \mathbb{PH}$,

$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{S} e^{-w^2/2}\, dw$,

with $S = \{ w \in \gamma_W : w \preceq_{wk} z \}$ assumed piecewise continuous and differentiable, in the sense that

$\lim_{n \to \infty} \mathcal{P}(W_n \preceq_{wk} z) = \Phi(z)$, for every $z \in \mathbb{PH}$.

Proof. The proof is based on the standard central limit theorem, known for real distributions, cf. [HE].
We also use the fact that if $z_{re} \to z_{0,re}$ and $z_{ph} \to z_{0,ph}$ as reals, then $z \to z_0$, and the properties of the
standard phantom normal distribution as addressed in Proposition 3.21.
By phantom computations, that are already familiar to the reader, we have

(6.4) $W_n = \frac{S_{n,re} - n\mu_{re}}{\sigma_{re} \sqrt{n}} + \wp \left( \frac{\hat S_n - n\hat\mu}{\hat\sigma \sqrt{n}} - \frac{S_{n,re} - n\mu_{re}}{\sigma_{re} \sqrt{n}} \right)$.

Suppose $z \in W_n$; then by the classical central limit theorem, each component converges to the standard normal
cumulative distribution function, and thus using the realization property of Equation (3.26) we get the desired result.

When $z \notin W_n$, apply the same argument to $\xi_{W_n}(z)$, cf. Equation (3.3), for which $\Phi(z) = \Phi(\xi_{W_n}(z))$ by
definition. □



The phantom central limit theorem is surprisingly general, maybe even more general than the known classical
one, which is a special case of the phantom theorem. (Note that here the integration is performed along a
path.) Besides independence, and the implicit assumption that the mean and variance are well-defined and
finite, it places no other requirement on the distribution of the $X_i$, even though they are phantoms, which
could be discrete, continuous, or mixed random variables.

This is of tremendous importance for several reasons, both conceptual and practical. On the conceptual 
side, it indicates that the sum of a large number of independent p.r. v.'s is approximately phantom normal. 
As such, it applies to many situations in which a random effect is the sum of a large number of small but 
independent random factors. Noise in many natural or engineered systems has this property. 

In a wide array of contexts, it has been found empirically that the statistics of noise are well-described by 
(real) normal distributions, and the central limit theorem provides a convincing explanation of this phenome- 
non. Here, we add another argument, recorded by the phantom term which might provide more information 
about the behavior of the noise. 

On the practical side, the phantom central limit theorem eliminates the need for detailed probabilistic 
models and for tedious manipulations of p. m. f.'s and p.d.f.'s. Rather, it reduces all the computations to a 
real familiar framework, and allows the calculation of specific probabilities by simply referring to the table of 
the standard normal distribution. Furthermore, these calculations only require knowledge about the phantom 
means and phantom variances. 

6.4. The strong law of large numbers. 

Theorem 6.8 (The strong law of large numbers (SLLN)). Let $X_1, X_2, \dots$ be a sequence of independent
identically distributed p.r.v.'s with mean $\mu$. Then, the sequence of sample means $M_n = (X_1 + \cdots + X_n)/n$
converges to $\mu$, with probability 1, in the sense that

$\mathcal{P}\left( \lim_{n \to \infty} \frac{X_1 + \cdots + X_n}{n} = \mu \right) = 1$.

Proof. Using the same argument as in the proof of Theorem 6.4, $\mathcal{P}_{re}$ is a standard probability measure. By
the classical SLLN, $\mathcal{P}_{re}\left( \lim_{n \to \infty} \frac{X_1 + \cdots + X_n}{n} = \mu \right) = 1$, and thus, since $\mathcal{P}$ is a phantom probability measure,
$\mathcal{P}_{ph}\left( \lim_{n \to \infty} \frac{X_1 + \cdots + X_n}{n} = \mu \right) = 0$. (Note that, for this purpose, the fact that the random variable may take
phantom values does not play a role; equivalently, the $X_i$ can be viewed as random variables that take values
in $\mathbb{R}^2$.) □

Consider a sequence of p.r.v.'s $X_1, X_2, \dots$ (not necessarily independent) associated with the same
probability model. Let $z_0$ be a phantom number. We say that $X_n$ converges to $z_0$ with probability 1 (or almost
surely) if

$\mathcal{P}\left( \lim_{n \to \infty} X_n = z_0 \right) = 1$.



In order to interpret the SLLN, one needs to use probabilistic phantom models in terms of sample spaces.
The contemplated experiment is infinitely long and generates experimental values for each one of the p.r.v.'s
in the sequence $X_1, X_2, \dots$. Thus, one should rather think of the sample space $\Omega$ as a set of infinite sequences
$\omega = (x_1, x_2, \dots)$ of phantom numbers: any such sequence is a possible outcome of the experiment. Let us now
define the subset $A$ of $\Omega$ consisting of those sequences $(x_1, x_2, \dots)$ whose long-term average is $\mu$, i.e.,

$(x_1, x_2, \dots) \in A \iff \lim_{n \to \infty} \frac{x_1 + \cdots + x_n}{n} = \mu$.

The SLLN states that all of the phantom probability is concentrated on this particular subset of $\Omega$.
Equivalently, the collection of outcomes that do not belong to $A$ (infinite sequences whose long-term average
is not $\mu$) has probability zero.

This means that the initial distortions of the probabilities become meaningless as $n \to \infty$, as well as their
phantom terms. (The latter have a special meaning when dealing with Markov chains and stochastic processes,
cf. [5].) Moreover, in the long term, the contribution of the phantom term lessens and tends to zero.

The difference between the weak and the strong law is subtle and deserves close scrutiny. The weak law
states that the probability $\mathcal{P}(|M_n - \mu| \ge \epsilon)$ of a significant deviation of $M_n$ from $\mu$ goes to zero as $n \to \infty$. Still,
for any finite $n$, this probability can be positive and it is conceivable that once in a while, even if infrequently,
$M_n$ deviates significantly from $\mu$. The weak law provides no conclusive information on the number of such
deviations, but the strong law does. According to the strong law, and with probability 1, $M_n$ converges to $\mu$.
This implies that for any given $\epsilon > 0$, the difference $|M_n - \mu|$ will exceed $\epsilon$ only a finite number of times.